SECTION 1
INTRODUCTION 

Neural  network  models  are  highly  parallel  alternatives  to  conventional  methods  of 
computation  for  solving  such  ill-structured  problems  as  pattern  recognition  or  robotic  control. 
Their  architecture  resembles  that  of  biological  nervous  systems,  which  solve  such  problems  so 
efficiently. Sophisticated neural network models and training procedures have evolved since the
pioneering work of the 1940s. Even so, neural network models have not yet convincingly
demonstrated  their  superiority  over  the  algorithm-oriented  approaches  of  artificial  intelligence  or 
classical  statistical  techniques.  Widespread  application  of  neural  network  technology  depends  on 
advances  in  both  theoretical  understanding  and  the  development  of  computing  platforms  explicitly 
designed  for  the  parallel  implementation  of  such  models.  Those  areas  of  research  naturally  interact 
and  complement  each  other.  For  example,  scaleup  and  convergence  issues  can  be  studied  both 
theoretically  and  by  actual  operation  of  neural  networks. 

In  the  1988-1989  contract  period,  we  addressed  the  need  for  hardware  realizations  of  neural 
networks  by  implementing  a  hybrid  optoelectronic  architecture.  The  massive  parallelism  of  neural 
network  models  that  makes  them  run  very  inefficiently  on  serial  machines  also  allows  them  to  be 
implemented  very  efficiently  on  parallel  optical  machines.  The  potential  speedup  factors  are  high. 

Our  approach  is  a  direct  analog  realization  of  neural  network  models  in  which  a  physical  node 
is  dedicated  to  representing  each  "neuron,"  or  processing  element.  By  not  having  to  multiplex 
neurons  among  physical  nodes  and  by  simultaneously  updating  all  weights  between  two  layers, 
using  optics,  we  achieve  very  large  throughputs  with  large  numbers  of  relatively  slow  processors, 
as  in  biological  nervous  systems.  The  stimulated  photorefractive  optical  neural  network 
(SPONN),  developed  under  this  contract  in  1988-1989,  is  a  fine-grained  optoelectronic 
architecture  characterized  by  massive  parallelism  and  much  greater  connectivity  than  is  possible  in 
electronic  approaches. 

SPONN is capable of implementing neural network models comprising $10^{5}$ neurons with
$10^{10}$ interconnections. SPONN's optical architecture is inherently suited to the mapping of
multilayer  neural  network  models;  moreover,  it  is  easily  programmable.  Its  weight  updating  rate  is 
independent  of  the  number  of  neurons.  In  contrast,  most  electronic  approaches  must  deal  with  data 
routing  and  contention  problems  arising  from  the  limited  connectivity  of  electronic  structures,  and 
therefore  depend  strongly  on  the  number  of  neurons  and  their  interconnections. 

In  SPONN,  neurons  are  implemented  as  pixels  on  a  two-dimensional  spatial  light  modulator 
(SLM)  and  interconnection  weights  are  established  holographically  as  gratings  in  photorefractive 
crystals.  A  unique  feature  is  our  use  of  a  continuum  of  spatially  and  angularly  distributed  gratings 
to represent each weight, rather than the single grating employed by prior holographic optical
neural  networks.  Multiple  gratings  eliminate  the  ambiguous  readout  of  gratings  and  the  crosstalk 
that  results  from  the  angular  degeneracy  of  the  Bragg  condition  for  diffraction  from  a  volume 
grating.  Our  new  technique  eliminates  the  need  for  subsampling  the  input/output  planes  and 
therefore  permits  full  utilization  of  the  SLM  space-bandwidth  product  for  representing  neurons, 
unlike  other  holographic  approaches.  We  can  implement  multilayer  neural  network  models  using  a 
single  photorefractive  crystal  and  SLM,  which  produces  a  compact  modular  system. 

The  continuum  of  gratings  is  generated  by  focusing  the  input  plane  into  a  self-pumped  or 
mutually-pumped  phase-conjugate  mirror  (PCM).  Stimulated  photorefractive  processes  in  the 
PCM  cause  each  pixel  in  the  input  plane  to  form  connections  with  all  other  pixels  via  distributed 
volume  gratings.  Moreover,  the  gratings  arrange  themselves  to  redistribute  the  incident  light  into  a 
phase-conjugate  output  wavefront  that  is  a  time-reversed  version  of  the  input  light.  Such  self- 
organization  yields  a  fully  parallel  and  massively  interconnected  physical  system  that  is  an  ideal 
implementation  medium  for  neural  network  models.  The  distributed  gratings  in  the  PCM  both 
store  the  weights  and  route  the  optical  beams. 

An important feature of SPONN is its hybridization of optics and electronics. It combines
the  large  storage  capacity,  parallelism,  and  connectivity  of  optical  structures  with  the  easy 
programmability  and  controllable  nonlinearity  of  electronic  structures.  A  video  frame  grabber  in 
conjunction  with  the  host  computer  carries  out  the  nonlinear  neuron  activation  functions  with 
minimal  computational  overhead.  We  can  implement  multilayer  neural  networks  by  spatially 
segregating  the  input  and  output  planes.  Unlike  all-optical  neural  networks,  SPONN  can  be  easily 
and  reproducibly  controlled. 
SECTION 2

VOLUME  HOLOGRAM  IMPLEMENTATIONS  OF  NEURAL  NETWORKS 

Volume  holograms  offer  two  features  required  by  neural  networks:  enormous  storage 
capacity  and  fully  parallel  processing  of  the  stored  interconnection  weight  values.  In  such  an 
optical  neural  network,  neurons  are  represented  by  pixels  on  two-dimensional  SLMs.  Pixel 
brightness  corresponds  to  the  activation  level  of  the  neuron.  When  the  SLM  is  placed  in  the  back 
focal plane of a lens and coherent readout is used, the light emitted by the pixels is converted to
coherent  beams  that  illuminate  a  real-time  holographic  medium. 

In  this  report,  we  represent  each  light  beam  by  a  momentum  or  k  vector.  (The  direction  of 
the  k  vector  corresponds  to  the  direction  of  propagation;  its  magnitude  is  the  inverse  of  the  optical 
wavelength  in  the  holographic  medium.)  Interconnection  weights  between  neurons  are  established 
when  a  pair  of  light  beams  interfere  in  the  holographic  medium,  producing  a  volume  sinusoidal 
light-intensity  pattern  that  interacts  with  the  medium.  The  photorefractive  effect  is  a  suitable 
physical  mechanism  for  converting  the  light-intensity  pattern  into  a  semipermanent  deformation  of 
the  optical  properties  of  the  material,  thus  recording  the  weight  values. 

In  the  photorefractive  effect,  incident  light  excites  carriers  (electrons,  holes)  from  traps  into 
the  conduction  or  valence  band.  The  carriers  are  then  transported  by  diffusion  and  drift  until  they 
fall  into  empty  traps,  thereby  creating  an  internal  space-charge  field  that  in  turn  modulates  the 
birefringence of the material through the electro-optic effect.1 Because of the long dark-decay times
of  some  photorefractive  materials,  the  resultant  phase  gratings  can  be  stored  with  a  time  constant  of 
many  hours.2  (Storage  for  longer  periods  is  also  possible  using  various  hologram-fixing  methods, 
discussed  below  in  subsection  3.7.) 

When  one  of  the  original  two  beams  subsequently  addresses  the  grating,  the  other  beam  is 
reconstructed  with  a  diffraction  efficiency  that  represents  the  interconnection  weight  value  between 
the  two  neurons.  In  general,  reading  out  the  grating  partially  erases  it  unless  the  readout  beam  is 
much  weaker  than  the  original  writing  light  or  the  crystal  is  fixed  by  means  of  special  techniques. 
Such  light  sensitivity  allows  us  to  implement  learning  in  our  photorefractive  optical  neural 
network, since we can selectively decrease as well as increase the weights. Photorefractive
materials and their application in optical data processing are an active area of research at HRL.

The  physical  mechanism  that  allows  large  numbers  of  gratings  to  be  stored  in  a 
photorefractive  crystal  is  described  by  the  Bragg  condition  for  constructive  scattering  off  a  volume 
grating:  a  beam  will  be  reconstructed  only  if  its  angle  of  incidence  is  approximately  equal  to  that  of 
the original writing beam. The angular selectivity for reconstruction can be derived from coupled
mode theory.3 It is given by

$$\Delta\theta = \frac{\lambda}{n T_z \sin\phi}$$

where $\lambda$ is the optical wavelength, $n$ is the index of refraction of the photorefractive crystal, $T_z$ is
the hologram thickness, and $\phi$ is the mean angle between the reference and object beamlets. The
angular selectivity is greater for thicker crystals. Phase matching arguments permit the Bragg
condition to be described geometrically as a vector sum: $K_i + K_g = K_j$, where $K_i$ and $K_j$ are the
wave vectors of the incident and diffracted beams, respectively, and $K_g$ is the grating wave vector.
Figure 1 illustrates a holographic interconnection between two optical neurons, with Figure 1(a)
showing how holographic gratings form an outer-product or Hebbian interconnection matrix and
Figure 1(b) describing the Bragg condition geometrically.
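
As a minimal numerical sketch of the angular selectivity expression, the following Python fragment evaluates the wavelength divided by the product of index, thickness, and sin(phi). The function name and the choice to treat the wavelength as the free-space value in meters are assumptions made for illustration; the sample values (514 nm, n = 2.5, 5-mm thickness, 45 deg) are those assumed later in this report for the subsampled system.

```python
import math

def angular_selectivity(wavelength_m, index, thickness_m, phi_rad):
    """Bragg angular selectivity in radians: lambda / (n * T_z * sin(phi))."""
    return wavelength_m / (index * thickness_m * math.sin(phi_rad))

# Sample values taken from the parameter list assumed later in the report.
delta_theta = angular_selectivity(514e-9, 2.5, 5e-3, math.radians(45))
print(f"delta_theta = {delta_theta:.3e} rad")

# Doubling the crystal thickness halves the acceptance angle,
# i.e. a thicker crystal is more selective.
delta_theta_thick = angular_selectivity(514e-9, 2.5, 10e-3, math.radians(45))
print(f"thicker crystal: {delta_theta_thick:.3e} rad")
```

The second call illustrates the statement that thicker crystals are more angularly selective.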

A  geometric  construction  for  the  theoretical  maximum  storage  capacity  of  a  volume  hologram 
can be drawn in k space, as shown in Figure 2. If the first writing beam varies over solid angle
$\theta_o$ whereas the second writing beam varies over angle $\theta_r$, then the vector difference between the
two beams (the grating wave vector $K_g$) will trace out a three-dimensional region in k space. The
volume  of  the  region  depends  on  such  geometric  factors  as  the  focal  lengths  of  the  optics  and  the 
spacing  of  the  neurons  on  the  SLMs. 

The grating wave vector $K_g$ has an uncertainty volume associated with it because of the lens
aperture,  the  finite  physical  size  of  the  hologram,  and  the  nonzero  size  of  the  SLM  pixels. 
Dividing  the  accessible  volume  of  k  space  by  the  value  of  the  uncertainty  volume  yields  the 
maximum  theoretical  number  of  resolvable  gratings  or  weights  that  can  be  stored  in  the 
photorefractive crystal. For a 1-cm$^3$ crystal, the theoretical upper limit is $10^{10}$ weights, assuming
currently available SLM resolution and reasonable optics. That number of weights is sufficient to
form a fully interconnected network of $10^{5}$ neurons. Partially interconnected networks with more
neurons can also be accommodated. Moreover, the entire neural network can be read out or
updated in parallel without the time-multiplexing, data-contention, or bottleneck problems common
in electronic implementations. The great storage capacity is a direct result of the three-dimensional
nature of optical holographic storage.4
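
The relation between weight capacity and network size can be checked with a short sketch. The function names and the fan-in of 1000 connections per neuron are hypothetical choices for illustration only; the capacity of ten billion weights is the upper limit quoted above.

```python
import math

def max_fully_connected_neurons(weight_capacity):
    """Largest n such that n * n weights fit within the capacity."""
    return math.isqrt(weight_capacity)

def max_neurons_with_fan_in(weight_capacity, fan_in):
    """Neuron count supported when each neuron receives only `fan_in` connections."""
    return weight_capacity // fan_in

capacity = 10**10  # theoretical upper limit quoted for a 1-cm^3 crystal

print(max_fully_connected_neurons(capacity))    # fully interconnected case
print(max_neurons_with_fan_in(capacity, 1000))  # hypothetical sparse case
```

A fully interconnected network uses one weight per neuron pair, so the neuron count scales as the square root of the capacity, whereas a sparsely connected network scales linearly with it.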

Figure 2. Region of k space used for information storage in optical neural network based on
volume hologram.

The challenge is to devise practical neural network systems able to approach the theoretical
limit.  Perhaps  the  most  important  obstacle  is  the  degenerate  nature  of  the  Bragg  condition,  which 
states  that  the  angle  of  incidence  of  a  light  beam  relative  to  a  volume  grating  must  match  that  of  the 
original  writing  beam  in  order  for  its  associated  beam  to  be  reconstructed.  However,  that  condition 
is  satisfied  by  a  set  of  beams  whose  k  vectors  form  a  cone  normal  to  the  grating,  as  shown  by 
Figure  3.  Therefore,  a  large  set  of  beams  other  than  the  original  beam  can  constructively  scatter 
off  the  grating,  forming  erroneous  reconstructions  and  crosstalk.  Two  methods  for  avoiding  the 
problem  have  been  suggested  in  the  literature:  subsampling  of  the  SLMs  and  spatial  multiplexing 
of  holograms.  However,  both  methods  are  problematic. 
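
The degeneracy can be illustrated numerically. The sketch below assumes a simple geometry that is not taken from the report: two writing beams of equal wave-vector magnitude at a hypothetical half-angle of 20 deg write one grating, and a readout beam is treated as Bragg matched when the sum of its wave vector and the grating vector has the same magnitude as the optical wave vector.

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def rotate(v, axis, angle):
    """Rotate v about a unit-length axis by angle (radians), Rodrigues' formula."""
    c, s = math.cos(angle), math.sin(angle)
    dot = sum(a * x for a, x in zip(axis, v))
    cross = (axis[1] * v[2] - axis[2] * v[1],
             axis[2] * v[0] - axis[0] * v[2],
             axis[0] * v[1] - axis[1] * v[0])
    return tuple(c * x + s * cx + (1 - c) * dot * a
                 for x, cx, a in zip(v, cross, axis))

K = 1.0                        # common wave-vector magnitude (arbitrary units)
HALF_ANGLE = math.radians(20)  # hypothetical half-angle between writing beams

k_ref = (K * math.sin(HALF_ANGLE), 0.0, K * math.cos(HALF_ANGLE))
k_obj = (-K * math.sin(HALF_ANGLE), 0.0, K * math.cos(HALF_ANGLE))
k_grating = tuple(o - r for o, r in zip(k_obj, k_ref))
grating_axis = tuple(x / norm(k_grating) for x in k_grating)

def bragg_mismatch(k_in):
    """| |k_in + K_g| - K |: zero when the readout beam is Bragg matched."""
    k_out = tuple(a + b for a, b in zip(k_in, k_grating))
    return abs(norm(k_out) - K)

# Swinging the reference beam around the grating vector keeps it Bragg matched:
# these beams lie on the degenerate cone.
cone = [rotate(k_ref, grating_axis, math.radians(a)) for a in (0, 30, 90, 180)]
cone_mismatch = [bragg_mismatch(k) for k in cone]

# Tilting the beam toward the grating vector moves it off the cone.
off_cone = rotate(k_ref, (0.0, 1.0, 0.0), math.radians(5))
off_cone_mismatch = bragg_mismatch(off_cone)

print(max(cone_mismatch), off_cone_mismatch)
```

Every beam on the cone reads out the grating as well as the original writing beam does, which is the source of the erroneous reconstructions described above.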

In  the  subsampling  method,  neurons  are  arranged  in  special  nonredundant  patterns  on  the 
SLMs,  and  output  planes  are  sampled  only  at  certain  locations.  Thus,  though  false  reconstructions 
still  occur,  they  do  not  contribute  to  the  output.  The  special  patterns  can  consist  of  fractal  grids5  or 
a  combination  of  one-  and  two-dimensional  sampling.6  If  the  SLMs  are  capable  of  displaying 
$N \times N$ neurons, then this method can implement a total of $N^{3/2}$ neurons and $N^3$ weights. The
storage  capacities  of  both  the  crystal  and  the  SLMs  thus  have  the  same  functional  dependence  on 
dimensional  scaling  (ignoring  limiting  effects  due  to  nonzero  SLM  pixel  size). 

However,  the  subsampling  method  does  not  allow  the  SLM  space-bandwidth  product  to  be 
fully  utilized  for  representing  neurons.  That  is  a  major  drawback.  As  discussed  above,  the  storage 
capacity of a 1-cm$^3$ crystal should be sufficient to store the interconnections for an $N \times N$ array of
neurons where $N$ = 500, which matches the capabilities of current SLMs such as the HRL liquid
crystal light valve (LCLV). Unfortunately, because of the subsampling, only $N^{3/2}$ neurons can be
implemented even though the SLM is capable of displaying $N^2$ neurons. Since $N$ = 500, the
neuron and weight storage capacity/throughput are reduced by factors of 22 and 500, respectively,
from  the  theoretical  maximums  for  current  SLMs.  The  light  efficiency  is  also  low  because  some  of 
the  light  is  diffracted  to  dead  areas  due  to  the  Bragg  ambiguity. 
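
The quoted reduction factors follow from simple scaling, as the sketch below shows. It assumes that the "theoretical maximum" corresponds to full interconnection of all neurons the SLM can display (the square of the neuron count); the function name is hypothetical.

```python
def subsampling_penalty(n_side):
    """Return (neuron factor, weight factor) lost to subsampling an n_side x n_side SLM."""
    full_neurons = n_side ** 2        # every SLM pixel used as a neuron
    full_weights = full_neurons ** 2  # assumed: full interconnection of those neurons
    sub_neurons = n_side ** 1.5       # neurons available with subsampling
    sub_weights = n_side ** 3         # weights available with subsampling
    return full_neurons / sub_neurons, full_weights / sub_weights

neuron_factor, weight_factor = subsampling_penalty(500)
print(f"neurons reduced by ~{neuron_factor:.0f}x, weights by {weight_factor:.0f}x")
```

For a side of 500 pixels the neuron penalty is the square root of 500, about 22, and the weight penalty is 500.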

The  spatial  multiplexing  method  avoids  the  Bragg  ambiguity  problem  by  physically  dividing 
the  crystal  into  separate  volumes  for  each  weight.  However,  such  divisions  effectively  reduce  the 
storage  capacity  to  the  low  level  of  a  two-dimensional  hologram. 

Figure 4 shows the general architecture of a subsampling photorefractive optical neural
network. Pixels in the object and reference planes represent individual neurons. The neurons are
optically interconnected by coherent light beams diffracted from volume phase gratings, which are
stored in a photorefractive crystal and which control the strength and phase of the interconnection
weights. The object and reference planes are physically located on the output faces of CRT-
addressed LCLVs that modulate incident, collimated coherent light beams, resulting in the
reflection of diverging beamlets of light from each neuron. The amplitude of each beamlet is
controlled by the activation level of the neuron. The object and reference beamlets are collimated
by two lenses of focal length F. The collimated object and reference beamlets are incident on the
photorefractive crystal, where they interfere to form volume gratings and thus determine the
interconnection weights. Such holographic gratings form an outer-product or Hebbian
interconnection matrix [see Figure 1(a)] between the object and reference planes. Positive,
negative, or complex contributions to the interconnection weights can be implemented using two
exposure stages, with different phases of the LCLV readout light.

Figure 3. Geometric construction of Bragg ambiguity in single-grating-per-weight storage.
Many wave-vector pairs can read out each grating, but this architecture restricts the
arrangement of neurons on the SLM.

Figure 4. Photorefractive optical neural network produced by subsampling method.

During  readout,  light  from  a  particular  neuron  is  diffracted  from  a  photorefractive  grating, 
sampled  by  a  beam  splitter,  and  focused  onto  a  detector  plane,  which  can  be  a  charge-coupled 
device  (CCD)  or  vidicon  video  camera.  The  object/reference  detector  planes  are  then  optically  or 
electronically  mapped  onto  the  object/reference  neuron  planes. 

Reflection  holograms  would  be  formed,  as  in  Figure  4.  That  result  is  usually  undesirable 
because the diffracted light from the reference plane would not propagate in the desired direction to
reconstruct  the  object  plane.  However,  an  additional  component,  a  phase-conjugate  mirror,  can  be 
introduced  to  phase-conjugate  the  light  from  the  reference  plane  after  it  passes  through  the 
hologram.  The  reference  and  object  beamlets  then  form  a  transmission  hologram  that  in  turn 
produces  a  real  image  of  the  object  plane  when  the  hologram  is  illuminated  with  light  from  the 
reference  plane.  Using  a  PCM  to  allow  a  single  SLM  to  both  expose  and  read  gratings  in  an 
optical  neural  network  was  first  suggested  by  Wagner  and  Psaltis. 

The  projected  performance  of  the  subsampled  system  can  be  estimated  by  analyzing  the  type 
and  amount  of  light.  Three  quantities  of  interest  can  be  defined:  N,  the  total  number  of  neurons; 
$N_{conn}$, the total number of interconnections; and $R$, the interconnection updating rate. $N$ is simply
given by the LCLV active area divided by the square of the neuron separation:

$$N = \frac{\pi (D/2)^2}{(\Delta x)^2}$$

where $D$ is the diameter of the LCLV active area and $\Delta x$ is the neuron separation, which is
determined by the angular selectivity of the volume hologram, $\Delta\theta$, and the focal length $F$:

$$\Delta x = 2F\,\Delta\theta$$

where the expression for $\Delta\theta$ was given previously. Combining the above expressions results in
Substituting all of the above expressions into $R = N_{conn}/t_E$ results in a final expression that
contains only the independent parameters of the system:


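[Expression for $R$ in terms of the independent parameters $\lambda$, $n$, $T_z$, $\phi$, $D$, $d_{neuron}$, $I_{inc}$, $I_{det}$, $W_{1\%}$, $F$, and $F_{det}$; the equation is not legible in the source.]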


Assume  the  following  reasonable  values  for  the  independent  parameters: 


Parameter       Value
$\lambda$       514 nm
$n$             2.5
$T_z$           5 mm
$\phi$          45 deg
$D$             50 mm
$d_{neuron}$    50 µm
$I_{inc}$       5 mW/cm$^2$
$I_{det}$       0.1 mW/cm$^2$
$W_{1\%}$       6 mJ/cm$^2$
$F$             500 mm
$F_{det}$       275 mm

Those assumptions yield $N = 1.7 \times 10^{5}$ neurons, $N_{conn} = 7 \times 10^{7}$ interconnections, and $R = 1 \times 10^{7}$
interconnections processed per second ($t_E$ = 8 s). This is the weight updating rate. Readout of
the neural net would occur at video rates; for example, the corresponding readout rate would be $7 \times 10^{7}$
interconnections divided by 30 ms, or $2 \times 10^{9}$ interconnections per second. Though the exposure
time $t_E$ is relatively long, the massive parallelism resulting from the optical interconnections results
in a very high processing rate that compares favorably with that of electronic implementations.
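
The two rates quoted above are simple ratios, reproduced in the sketch below using only the figures stated in the text (seventy million interconnections, an 8-s exposure, and a 30-ms frame time). The variable names are illustrative.

```python
N_CONN = 7e7          # interconnections in the subsampled network
T_EXPOSURE_S = 8.0    # exposure time for a weight update, seconds
T_FRAME_S = 30e-3     # video frame time used for readout, seconds

update_rate = N_CONN / T_EXPOSURE_S   # weight updating rate
readout_rate = N_CONN / T_FRAME_S     # readout rate at video rates

print(f"update rate  ~ {update_rate:.1e} interconnections/s")
print(f"readout rate ~ {readout_rate:.1e} interconnections/s")
```

The update rate evaluates to slightly under ten million per second and the readout rate to slightly over two billion per second, consistent with the rounded values in the text.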

The  assumed  values  for  the  independent  parameters  are  based  on  the  current  state  of  the  art 
for  LCLVs  and  commercially  available  detectors  without  cooling  or  image  intensification.  The 
value of $W_{1\%}$ = 6 mJ/cm$^2$ is a best-case measured value for BaTiO3 with an applied electric field of
10 kV/cm.5 The resulting values are impressive compared with the corresponding values of
electronic implementations, but they are limited by the subsampling of the SLM input plane.
Improved storage and throughput values would result if the need for subsampling to avoid
crosstalk could be eliminated.
3.1 SPONN CONCEPT


We  have  begun  experimental  verification  of  SPONN,  a  new  and  unique  alternative  method 
for  avoiding  the  Bragg  ambiguity  problem.  Without  sacrificing  parallelism,  it  makes  full  use  of  the 
SLM  space-bandwidth  product  and  provides  much  greater  storage  capacity  than  does  the 
subsampling  method.  The  essence  of  our  idea  is  to  store  each  weight  in  a  set  of  angularly  and 
spatially  multiplexed  gratings  rather  than  in  a  single  grating.  The  rejection  of  crosstalk  may  be 
greatly  increased  by  forcing  a  light  beam  to  match  the  Bragg  condition  at  each  of  a  series  of 
spatially  and  angularly  distributed  gratings,  as  shown  in  Figure  5(b).  An  undesirable  beam  on  the 
degenerate  cone  of  one  grating  (see  Figure  3  above)  is  rejected  by  the  remaining  gratings.  That 
rejection  allows  the  neurons  to  be  arranged  in  arbitrary  patterns  on  the  SLM,  increasing  both 
storage  capacity  and  throughput  as  well  as  offering  other  benefits  such  as  less  stringent  alignment 
requirements  and  the  use  of  the  same  crystal  for  both  beam  routing  and  storage  of  weights. 
Though  we  use  a  larger  fraction  of  the  hologram  space-bandwidth  product  to  store  each  weight,  the 
increased  storage  space  is  more  than  offset  by  the  improved  utilization  of  the  SLM  input  plane. 

The  physical  process  used  to  generate  multiple-grating  weight  representation  is  phase 
conjugation  based  on  stimulated  photorefractive  scattering.  Specifically,  we  propose  the  use  of  a 
self-  or  mutually-pumped  photorefractive  PCM  as  both  the  storage  element  and  the  beam  router  in  a 
programmable  optical  neural  network.  Basically,  self-pumped  phase  conjugation  starts  with  an 
image-bearing  optical  beam  focused  into  a  photorefractive  crystal.  Light  scattered  from  crystal 
inhomogeneities  will  write  gratings  by  interfering  with  the  incident  beam.  The  gratings  will  in  turn 
scatter  more  light  through  a  dynamic  two-wave  mixing  interaction  in  which  light  energy  is 
transferred  from  the  incident  beam  to  other  scattered  beams. 


If  the  relevant  electro-optical  coefficient  is  large  enough  and  the  interaction  length  long 
enough,  scattered  light  will  be  selectively  amplified  through  the  stimulated  photorefractive  gain 
mechanism, which can be easily observed as beam fanning in crystals of BaTiO3. Through
reflections of the fanned light at crystal corners7 or through photorefractive backscattering8, the
stimulated  process  arranges  gratings  in  volume  distributions,  which  generate  the  phase-conjugate 
or  time-reversed  image  beam  propagating  backward  along  its  original  incident  direction. 


A mutually pumped PCM operates similarly except that two image beams are focused into the
crystal.9 The light from one beam forms the phase conjugate of the other beam and vice versa,
though the two beams may be incoherent with respect to each other. Gratings produced by the
interference of each beam with its own fanning light arrange themselves so as to form the phase
conjugates of the two incident beams. We will refer to self- and mutually-pumped PCMs
collectively as stimulated PCMs, or SPCMs.

Figure 5. Use of multiple gratings to reduce crosstalk resulting from Bragg ambiguity.
(a) Single grating. (b) Multiple gratings.

The  key  point  for  our  SPONN  architecture  is  that  each  pixel  in  the  image  incident  on  the 
crystal  forms  gratings  with,  and  hence  is  connected  to,  many  other  pixels.  The  degree  of 
connectivity  can  be  adjusted  by  varying  the  position  of  the  crystal  relative  to  the  input  lens.  For 
example,  if  the  crystal  is  in  the  back  focal  plane  of  the  lens  where  the  Fourier  transform  of  the  input 
image  is  found,  light  from  each  pixel  will  overlap  with  light  from  all  other  pixels,  establishing  a 
very  large,  fully  interconnected  physical  system  suitable  for  implementing  fully  interconnected 
neural  network  models.  On  the  other  hand,  if  a  slightly  misfocused  version  of  the  input  image  is 
incident  on  the  crystal,  the  pixel  connections  will  be  more  localized,  allowing  neighborhood  neural 
network  operations  such  as  lateral  inhibition  to  be  implemented. 

Moreover,  the  distributed  gratings  form  precisely  the  continuum  of  spatially  and  angularly 
multiplexed  gratings  described  above  as  a  method  for  avoiding  Bragg  ambiguity  in  optical  neural 
networks.  Simultaneously,  the  gratings  produce  an  output  that  is  the  phase  conjugate  of  the  input, 
simplifying  the  optical  design  and  making  the  system  tolerant  of  component  imperfections  and 
variations.  The  following  subsections  will  discuss  the  architecture  and  operation  of  the  SPONN 
system  as  well  as  some  initial  experimental  results. 
3.2 SPONN ARCHITECTURE

SPONN  systems  using  self-  and  mutually-pumped  PCMs  are  diagramed  in  Figures  6  and  7, 
respectively. Neurons are represented by pixels on the HRL-invented LCLV,10 an SLM capable of
displaying $10^{5}$ pixels at video frame times (33 ms). In Figure 7, the plane of neurons is divided
into sections, L1, L2, L3, L4, each of which represents a layer in the neural network. Light from
the  optical  neurons  is  directed  into  the  self-pumped  PCM.  The  conjugate  return,  consisting  of  the 
input  summed  over  the  photorefractive  weights,  is  directed  by  a  beam  splitter  into  a  video  detector 
such  as  a  CCD  camera.  The  weighted  sums  are  passed  through  nonlinear  neuron-activation 
functions  electronically  at  video  rates  in  the  image  processor,  a  frame  grabber  with  nonlinear 
lookup  tables.  The  result  can  be  either  sent  to  the  host  computer  as  a  final  calculation  or  back  to  the 
PCM  through  the  LCLV  if  the  network  is  being  iteratively  trained. 
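
The electronic activation step can be sketched as a table lookup applied to every detected pixel. The example below is an illustration only: the 8-bit digitization, the sigmoid shape, and the gain and threshold values are assumptions, not details of the image processor used in SPONN.

```python
import math

LEVELS = 256  # assumed 8-bit video digitization

def build_sigmoid_lut(gain=8.0, threshold=0.5):
    """Precompute a sigmoid activation for every possible gray level."""
    lut = []
    for level in range(LEVELS):
        x = level / (LEVELS - 1)                        # normalize to 0..1
        y = 1.0 / (1.0 + math.exp(-gain * (x - threshold)))
        lut.append(round(y * (LEVELS - 1)))             # back to a gray level
    return lut

def apply_lut(frame, lut):
    """Map each detected pixel (a weighted sum) through the activation table."""
    return [[lut[pixel] for pixel in row] for row in frame]

lut = build_sigmoid_lut()
detected = [[0, 64, 128], [192, 255, 100]]  # toy 2x3 "camera frame"
activations = apply_lut(detected, lut)
print(activations)
```

Because the table is computed once, applying the nonlinearity costs one memory lookup per pixel regardless of the function chosen, which is why a frame grabber can do it at video rates.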

Incremental  weight  changes  follow  an  outer-product  or  Hebbian  learning  rule.  Multilayer 
neural  networks  are  implemented  by  devoting  separate  areas  of  the  LCLV  to  each  layer.  Large 
training  sets  of  exemplar  patterns  can  be  accessed  by  means  of  optical  or  magnetic  disk  image 
storage  technology.  For  example,  commercially  available  disk  technology  will  allow  us  to  access 
thousands of $10^{5}$-pixel exemplar patterns for training with random access times of less than
200  ms  per  pattern. 
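
An outer-product (Hebbian) weight increment can be written in a few lines. This is a generic numerical sketch of the learning rule, not a model of the photorefractive recording process; the function name, the learning-rate parameter, and the toy activations are assumptions.

```python
def hebbian_update(weights, pre, post, rate):
    """Return weights + rate * outer(post, pre) without modifying the input."""
    return [[w + rate * post_j * pre_i for w, pre_i in zip(row, pre)]
            for row, post_j in zip(weights, post)]

weights = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # 2 output neurons, 3 input neurons
pre = [1.0, 0.0, 0.5]                         # input-layer activations
post = [0.5, 1.0]                             # output-layer activations

strengthened = hebbian_update(weights, pre, post, rate=0.1)
weakened = hebbian_update(strengthened, pre, post, rate=-0.1)
print(strengthened)
print(weakened)
```

A negative rate stands in for selective erasure, since the weights can be decreased as well as increased.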

The  readout  time  of  an  optical  neural  network  (operative  mode)  in  SPONN  is  one  video 
frame time (33 ms). The current limiting factors are the response time of the LCLV and our use of
commercially  available  image  processing  components  that  are  compatible  with  American  video 
engineering  standards.  The  time  required  to  modify  the  weights  (training  mode)  depends  on  the 
incident  light  intensity.  However,  for  readily  available  continuous-wave  (CW)  argon  laser  powers 
of  100  mW,  the  parallel  weight  modification  time  is  approximately  100  ms  for  crystals  of 
commercially available BaTiO3 at room temperature. (Other HRL researchers have successfully
reduced the response time of BaTiO3 by two orders of magnitude through heating to 120°C.11) A
significant advantage of SPONN is that its optical parallelism makes the readout and modification
times independent of the neural network size. With room-temperature BaTiO3, the theoretical
weight-processing throughput would therefore be $10^{11}$ interconnections per second for a network
of $10^{5}$ neurons.
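
The throughput figure follows directly from the parallel update time. The sketch below assumes a fully interconnected network, so that the number of weights is the square of the neuron count; the function name is illustrative.

```python
def weight_throughput(n_neurons, update_time_s):
    """Interconnections processed per second when all weights of a fully
    connected network are modified in one parallel optical exposure."""
    return n_neurons ** 2 / update_time_s

# 1e5 neurons with the roughly 100-ms modification time quoted for BaTiO3.
print(f"{weight_throughput(10**5, 0.100):.1e} interconnections/s")
```

Because the update time does not depend on network size, the throughput in this model grows with the number of interconnections.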

Phase  conjugation  enables  the  weight  storage  method  to  compensate  for  optical  distortions 
and  simplify  the  optical  design  and  alignment.  The  only  critical  alignment  is  between  the  output 
and  detected  images,  but  that  can  be  performed  electronically  by  the  image  processor.  The  use  of  a 
single  photorefractive  crystal  is  another  beneficial  feature  of  SPONN,  especially  for  multilayer 
neural network models. Since coherent interference in SPCMs occurs between an incident beam
and light scattered from it, SPONN is less sensitive to vibration than other coherent interferometric
optical neural networks in which separate, externally generated beams propagating over large
distances must be held stable with respect to each other to within a fraction of a wavelength.

Figure 6. Self-pumped SPONN with electronic thresholding.

Figure 7. Three-layer SPONN using single mutually pumped PCM.

Compact,  rugged,  laser-diode-pumped  solid  state  lasers  with  large  output  powers  are 
becoming  commercially  available.12  They  could  replace  the  relatively  bulky,  water-cooled  argon 
laser  used  in  our  evaluation  experiments.  With  such  a  laser,  the  SPONN  system  would  occupy 
less  than  one  cubic  foot  and  be  able  to  implement  parallel  neural  networks  potentially  consisting  of 
up to 10^10 interconnections. SPONN also lends itself to modularity, as multiple units could be
connected  to  the  host  computer  bus,  as  shown  in  Figure  8.  Multiple  neural  networks  could  then 
execute  simultaneously  on  the  SPONN  modules,  with  cooperative  data  exchange  coordinated  by 
the  host  computer.  Such  a  system  could  be  readily  expanded  by  simply  adding  more  SPONN 
modules  to  the  host  bus. 




Figure 8. Expandable SPONN architecture for parallel implementation of multiple neural
network modules operating cooperatively.

3.3 SPONN WEIGHT MODIFICATION

Learning, i.e., weight modification, in SPONN is accomplished by modifying the
interconnection  weights  between  neurons  through  changing  the  gratings  in  the  photorefractive 
material.  The  material  equations  of  Kukhtarev  et  al.13  permit  the  derivation  of  a  set  of  coupled 
differential  equations  that  describe  grating  formation  in  photorefractive  materials,  assuming  that 
two complex optical amplitudes A_p and A_s (of the pump and scattered beams, respectively)
interfere coherently to form the space-charge field E:14

$$\frac{\partial A_p}{\partial x_p} = -\frac{i}{2}\,\frac{k}{n_p}\, r_{\mathrm{eff},p}\, A_s E - \alpha A_p$$

$$\frac{\partial A_s}{\partial x_s} = -\frac{i}{2}\,\frac{k}{n_s}\, r_{\mathrm{eff},s}\, A_p E^{*} - \alpha A_s$$

$$\frac{\partial E}{\partial t} + \frac{E}{\tau} = \frac{i E_{sc} A_p A_s^{*}}{\tau\left(|A_p|^2 + |A_s|^2\right)}$$

where x_p and x_s are coordinates along the directions of propagation of the pump and scattered
waves, respectively, n_p and n_s are the refractive indices in those directions, k is the optical wave
number, r_eff is the effective electro-optical coefficient, α is the optical absorption coefficient, τ is
the space-charge field decay rate, and E_sc is a function of the material constants and grating wave
number. The above equations cannot begin to model the full complexity of SPONN, where a great
number of beams scatter and interact with each other in order to form a connection weight.

A  more  realistic  model  would  require  the  solution  of  a  very  large  system  of  coupled 
differential  equations  consisting  of  a  set  of  equations  similar  to  those  above  for  each  grating  in  the 
crystal,  all  coupled  together.  The  boundary  conditions  would  depend  on  details  of  crystal 
geometry  and  inhomogeneities.  Nevertheless,  some  understanding  of  the  grating  dynamics 
relevant  to  weight  formation  can  be  obtained  from  the  above  equations  by  considering  a  single 
isolated  grating.  For  example,  the  above  equations  demonstrate  that  the  amplitude  diffraction 
efficiency  or  connection  weight  increases  with  the  grating  space-charge  field  E.  Also,  at  initial 
stages of grating formation (E = 0), the rate of formation is proportional to the product of the
writing beam amplitudes. That dependence corresponds to outer-product or Hebbian learning. The
steady-state  value  of  E,  obtained  by  setting  the  time  derivative  to  zero,  is  also  given  by  the  product 
of  the  steady-state  pump  and  scattered  beams. 
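The single-grating behavior described above can be illustrated numerically. The sketch below is a simplification of ours, not the full coupled system: it holds the beam amplitudes fixed (no depletion or propagation) and integrates only the space-charge equation, taken in the relaxation form dE/dt = (E_ss - E)/τ with E_ss = i E_sc A_p A_s* / (|A_p|^2 + |A_s|^2). All parameter values are arbitrary.

```python
def evolve_grating(a_p, a_s, e_sc=1.0, tau=1.0, dt=1e-3, steps=10000):
    """Forward-Euler integration of dE/dt = (E_ss - E) / tau for one
    isolated grating with fixed writing-beam amplitudes."""
    a_p, a_s = complex(a_p), complex(a_s)
    intensity = abs(a_p) ** 2 + abs(a_s) ** 2
    e_steady = 1j * e_sc * a_p * a_s.conjugate() / intensity
    e = 0j                      # blank crystal: no grating at t = 0
    history = [e]
    for _ in range(steps):
        e += dt * (e_steady - e) / tau
        history.append(e)
    return history, e_steady


history, e_steady = evolve_grating(a_p=1.0, a_s=0.1)

# Initial growth rate: proportional to the product A_p * conj(A_s).
initial_rate = (history[1] - history[0]) / 1e-3

print("initial rate :", initial_rate)
print("steady state :", e_steady)
print("after 10 tau :", history[-1])
```

In this simplified picture both the initial rate and the steady-state field are set by the product of the two amplitudes, which is the outer-product behavior the text refers to.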

One  approach  to  learning  in  SPONN  is  to  first  initialize  the  connection  pathways  by  turning 
on all neurons. That establishes the gratings, which will then be modified during learning. Since
the  time  required  to  form  gratings  in  a  blank  crystal  is  much  longer  than  the  grating  adjustment 
time,  initialization  also  improves  the  learning  rate.  The  gratings  are  adjusted  during  outer-product 
or  Hebbian  learning  in  SPONN  by  forming  outer  products  between  error  signals  and  the  input 
signals  in  the  previous  layer.  The  host  computer  calculates  error  signals  by  determining  the 
difference  between  the  actual  output  of  SPONN  and  the  desired  output,  as  discussed  in  subsection 
3.4.  Learning  is  conducted  at  a  rate  faster  than  the  photorefractive  response  time,  so  the  gratings 
are  never  in  equilibrium  with  the  error  signals.  Since  the  photorefractive  response  time  is  intensity- 
dependent,  SPONN  can  be  switched  from  the  learning  mode  to  the  readout  mode  simply  by 
reducing  the  readout  light  intensity.  Alternatively,  hologram-fixing  techniques  can  possibly  be 
used  for  nondestructive  readout. 
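The requirement that learning proceed faster than the photorefractive response time can be illustrated with a toy model. The sketch below is an assumption of ours, not a model taken from the experiments: it treats one weight as a first-order system that relaxes toward the drive set by whichever exemplar is currently displayed. The drive values and times are hypothetical.

```python
import math


def expose(weight, drive, exposure_time, tau):
    """Relax one weight toward `drive` for `exposure_time` seconds,
    assuming a first-order response with time constant `tau`."""
    mix = 1.0 - math.exp(-exposure_time / tau)
    return weight + mix * (drive - weight)


def train(drives, exposure_time, tau, cycles):
    """Cycle repeatedly through the exemplar drives, starting from zero."""
    weight = 0.0
    for _ in range(cycles):
        for drive in drives:
            weight = expose(weight, drive, exposure_time, tau)
    return weight


drives = [1.0, 0.2]  # hypothetical grating drives from two exemplars
tau = 0.1            # s, nominal photorefractive response time

# Exposure much shorter than tau: contributions from both exemplars accumulate.
fast = train(drives, exposure_time=0.001, tau=tau, cycles=2000)
# Exposure much longer than tau: the weight tracks only the last exemplar.
slow = train(drives, exposure_time=1.0, tau=tau, cycles=5)

print(f"short exposures: {fast:.3f}")
print(f"long exposures:  {slow:.3f}")
```

With short exposures the weight settles near the average of the two drives; with long exposures it simply follows the most recent exemplar, which is the "forgetting" that nonequilibrium learning avoids.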

An  important  advantage  of  SPONN  is  its  ability  to  implement  bipolar  weights  that  can  be 
selectively  increased  or  decreased.  That  can  be  accomplished  in  several  ways.  For  example, 
shifting  the  phase  of  a  neuron  on  the  LCLV  in  turn  shifts  the  phase  of  the  gratings  written  by  that 
neuron  and  selectively  erases  connections  between  it  and  other  neurons.  Another  method  is  to  use 
two  sets  of  gratings  for  each  weight,  one  for  positive  weights  and  the  other  for  negative  weights. 
The  difference  between  the  two  weight  contributions  is  calculated  electronically  by  using  two  SLM 
pixels  per  neuron. 
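The two-grating-set method amounts to splitting each signed weight into two nonnegative parts and subtracting the two detected sums. A minimal sketch follows; the function names and numerical values are ours and purely illustrative.

```python
def split_bipolar(weights):
    """Split signed weights into two nonnegative matrices
    (positive parts and magnitudes of negative parts)."""
    positive = [[max(w, 0.0) for w in row] for row in weights]
    negative = [[max(-w, 0.0) for w in row] for row in weights]
    return positive, negative


def weighted_sum(matrix, inputs):
    """Weighted sum of the inputs for each output row."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in matrix]


def bipolar_output(positive, negative, inputs):
    """Two detected values per neuron, subtracted electronically."""
    plus = weighted_sum(positive, inputs)
    minus = weighted_sum(negative, inputs)
    return [p - m for p, m in zip(plus, minus)]


weights = [[0.5, -0.25], [-1.0, 0.75]]
inputs = [1.0, 2.0]

pos, neg = split_bipolar(weights)
out = bipolar_output(pos, neg, inputs)
direct = weighted_sum(weights, inputs)

print(out, direct)
```

Only nonnegative quantities are stored and detected; the sign appears in the electronic subtraction, at the cost of two pixels per neuron.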

3.4  EXPERIMENTAL  VERIFICATION  OF  SPONN  CONNECTIVITY 

We  have  experimentally  verified  SPONN  connectivity  using  the  apparatus  diagramed  in 
Figure  6.  Sample  SPONN  output  demonstrating  connectivity  is  shown  in  Figure  9  for  an 
arbitrary  abstract  array  of  1024  fully  switched  on  neurons.  Readout  with  a  partial  version  of  the 
training  image  fills  in  the  central  blank  area,  demonstrating  that  the  outermost  neurons  have  formed 
connections  with  the  central  ones.  Another  example  of  connectivity  is  shown  in  Figure  10,  which 
presents  complete  SPONN  outputs  for  complete  and  partial  input  images  of  a  resolution  chart. 

Figure 11 illustrates the elimination of Bragg degeneracy. The steady-state phase-conjugate
output for a 1024-neuron input array on the LCLV is shown in the middle photograph. When the
entire array was shifted half a period in any direction by moving the data in the image processor
frame memory, the output disappeared immediately, demonstrating crosstalk suppression without
subsampling of the SLM. The output reappeared when the array was shifted a full period.

Figure 11. Experimental demonstration of crosstalk reduction in SPONN. (a) Input training
pattern. (b) SPONN output for input a. (c) SPONN output for input a shifted by
half an array period.

Figure  12  illustrates  selective  grating  weight  erasure  by  shifting  the  phase  of  a  single  neuron 
on the LCLV. The phase of the indicated optical neuron was shifted by π without affecting its
amplitude by modifying the optical amplitude versus applied voltage transfer curve. This was done
by rotating the LCLV relative to the input polarization, which resulted in a non-monotonic transfer
curve. By adjusting the operating parameters, two operating points with the same intensity but
shifted in phase could be defined. Computer input via the CRT that addresses the LCLV was then
used to select between the two operating points. A complementary grating could then be written
in the crystal to compensate the initial grating, implementing active coherent erasure of weights
by in effect adding a weight vector opposite in sign. In this manner bipolar weights can be
implemented  in  a  photorefractive  crystal.  A  disadvantage  of  this  approach  to  implementing  bipolar 
weights  is  that  although  it  requires  only  a  single  LCLV,  gray  scale  operation  is  not  possible  since 
independent  control  of  both  phase  and  amplitude  is  not  possible  over  a  continuous  range  of  values. 
This  problem  can  be  avoided  by  using  a  second  LCLV  operated  in  phase-only  mode  and  imaging  it 
onto  the  amplitude/phase  LCLV.  The  phase-only  LCLV  would  then  be  used  to  both  implement 
bipolar  weights  and  to  compensate  for  phase  distortions  in  the  amplitude/phase  LCLV.  In  this  type 
of  coherent  representation  of  bipolar  weights  it  is  necessary  to  measure  the  phase  of  the  PCM 
output  interferometrically  in  order  to  determine  the  sign  of  the  neuron  outputs,  which  may  result  in 
practical  difficulties  due  to  stability  and  alignment  requirements.  Spatial  multiplexing  of  the 
positive  and  negative  parts  of  the  weights  using  strictly  positive  connections  can  also  be  used  to 
represent  bipolar  weights.  This  approach  has  the  advantages  of  requiring  only  a  single  LCLV  and 
not  requiring  coherent  detection,  but  at  the  expense  of  using  two  pixels  to  represent  each  neuron 
rather  than  one.  However,  the  practical  advantages  may  be  worth  the  trade-off  in  neuron  number. 
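The coherent erasure described above can be pictured with complex amplitudes. The sketch below is an idealization of ours: it assumes each exposure adds a grating increment proportional to the product of one writing amplitude and the conjugate of the other, and it ignores decay and beam coupling.

```python
import cmath


def grating_increment(pump, scattered, strength=1.0):
    """Grating increment written by two beams, taken as proportional
    to pump * conj(scattered) (idealized, no decay)."""
    return strength * pump * scattered.conjugate()


neuron = 1.0 + 0j  # amplitude of the neuron being phase shifted
other = 0.8 + 0j   # amplitude of a neuron it is connected to

grating = grating_increment(neuron, other)

# Same intensity, phase shifted by pi on the LCLV.
shifted = neuron * cmath.exp(1j * cmath.pi)
erased = grating + grating_increment(shifted, other)

# A weaker second exposure only partially cancels the connection.
partial = grating + grating_increment(shifted, other, strength=0.5)

print(abs(grating), abs(erased), abs(partial))
```

The π-shifted exposure writes an increment of opposite sign, so the sum of the two gratings cancels; a partial exposure leaves a reduced weight.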

We  have  investigated  the  effects  of  crystal  position  relative  to  the  focusing  lens  on  SPONN 
connectivity.  When  the  entrance  face  of  the  crystal  is  located  in  the  back  focal  plane  of  the  lens  the 
connectivity  is  global.  As  shown  in  Figure  13,  each  neuron  is  connected  to  almost  all  of  the  other 
neurons  in  the  input  plane.  This  is  perhaps  not  surprising  since  the  region  around  the  Fourier 
plane  contains  the  largest  degree  of  spatial  overlap  between  beams  originating  from  neurons  in  the 
input plane. In our initial experiments we were able to demonstrate a fanout of 256. When the
crystal was moved a few mm from the Fourier plane, the connectivity became more localized, with
the range of connections greater in the horizontal direction, as illustrated in Figure 14. This was
probably due to the character of the light distribution at the entrance face being closer to an image
of the input plane than to its Fourier transform. The spatial overlap between neuron light beams
was then more dependent on scattering and fanning in the crystal due to photorefractive two-beam
coupling effects, and also on Fresnel reflections at the sides of the crystal. Since fanning occurs
predominantly in the horizontal direction in BaTiO3 in this geometry, it is not surprising that the
connectivity has a larger range in the horizontal direction. The ability to control the range of
connectivity by adjusting the position of the crystal will be useful for implementing neural network
models with localized connections, such as many early vision models.

Figure 12. Experimental demonstration of selective grating weight erasure in SPONN by
phase-shifting an optical neuron. (a) SPONN output for rectangular array input.
(b) SPONN output after phase shifting.

Figure 14. Demonstration of localized connectivity; crystal situated a few mm from Fourier plane
of input.

A  nonfundamental  but  possibly  practical  limitation  to  SPONN  connectivity  is  noise. 
Potential  types  and  sources  of  noise  include  laser  temporal  noise,  backscattered  nonconjugate  light 
from  the  PCM,  poor  conjugation  fidelity,  and  detector  noise.  Fixed  spatial  noise  or  poor  SLM 
contrast  can  be  partially  offset  during  the  learning  phase.  We  believe  that  the  detector  signal-to- 
noise  ratio  (SNR)  may  be  the  practical  limitation  for  neuron  fan-in/fan-out.  Commercially  available 
cooled  CCD  or  charge-injection  device  (CID)  detectors  may  be  necessary  to  achieve  the  full 
connectivity  potential  of  SPONN. 

3.5  MAPPING  OF  NEURAL  NETWORK  MODELS 

Abstract  neural  network  models  must  be  somehow  mapped  onto  the  optical  hardware. 
Figure  15  illustrates  the  neural  network  topology  for  a  self-pumped  SPONN.  Figure  15(a) 
shows  a  back-propagation  neural  network  with  a  single  hidden  layer.  The  neuron  plane  on  the 
SLM is divided into three regions, L1, L2, and L3, which correspond to the input, hidden, and
output  layers  of  the  neural  network,  respectively.  The  neuron  activation  levels  are  controlled  by 
the  image  processor  (represented  schematically  in  Figure  6).  The  grating  connection  pathways  are 
initialized  by  setting  all  neurons  in  all  layers  fully  on  until  a  steady-state  phase-conjugate  return  is 
observed  on  the  video  detector.  Learning  can  then  proceed. 

First, as shown in Figure 15(b), an exemplar pattern is created by the image processor in
region L1 while inputs to regions L2 and L3 are turned off. The resultant light intensities that arise
from L1 and are detected in region L2 are then stored in the image processor after electronic
thresholding by means of lookup tables. Region L1 is then switched off, the thresholded output of
L2 is displayed on the SLM, and the resultant output intensity pattern in L3 is detected,
thresholded, and recorded in the image processor. An error pattern is formed electronically using
point-by-point subtraction in the image processor. (Because only one operation is required per
neuron, that step is not computationally burdensome.) The incremental weight adjustment for each
layer in the back-propagation procedure is given by the outer product of the error signal and the
input pattern for that layer.15 As discussed previously, the incremental change in the diffraction
efficiency of a photorefractive grating is given by the outer product of the writing beams.
Therefore, the gratings in the PCM are adjusted and a single back-propagation pass is completed
by displaying the error pattern in L_n and the input pattern in L_(n-1), then sequencing down through
the layers. Subsequently, the next exemplar is displayed in L1 and the procedure is repeated. Since
learning is a nonequilibrium process, the exemplar integration time must be less than the
photorefractive time constant to prevent the "forgetting" of previous exemplar contributions.

Figure 15. Neural network topology for self-pumped SPONN, with δ_i = B_i - B_i^m formed
electrically. (a) Example of supervised learning. (b) Neural network model mapped
onto optical hardware.
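The readout sequence and outer-product adjustment described above can be sketched in software. The code below is a simplified stand-in of ours, not the SPONN control program: the lookup table, detector gain, weights, and learning rate are hypothetical, the error is taken as desired minus actual so that a positive rate reduces it, and only the output-layer weights are adjusted (the hidden-layer step, which needs a back-propagated error pattern, is omitted).

```python
import math

# Hypothetical 256-entry lookup table: sigmoid threshold on 8-bit detector values.
LUT = [1.0 / (1.0 + math.exp(-(v - 128) / 16.0)) for v in range(256)]


def threshold(intensities):
    """Electronic thresholding: quantize to 8 bits, then look up the activation."""
    return [LUT[max(0, min(255, int(round(v))))] for v in intensities]


def propagate(weights, activations, gain=255.0):
    """Weighted sum from one SLM region into the next, scaled to detector counts."""
    return [gain * sum(w * a for w, a in zip(row, activations)) for row in weights]


def outer_product_update(weights, error, inputs, rate):
    """In-place increment: weights[i][j] += rate * error[i] * inputs[j]."""
    for i, e in enumerate(error):
        for j, x in enumerate(inputs):
            weights[i][j] += rate * e * x


def forward(w12, w23, exemplar):
    hidden = threshold(propagate(w12, exemplar))  # L1 displayed, L2 detected
    output = threshold(propagate(w23, hidden))    # L2 displayed, L3 detected
    return hidden, output


w12 = [[0.5, 0.5], [0.25, 0.75]]  # L1 -> L2 weights (illustrative)
w23 = [[0.5, 0.5]]                # L2 -> L3 weights (illustrative)
exemplar, desired = [1.0, 0.0], [1.0]

hidden, output = forward(w12, w23, exemplar)
error = [d - o for d, o in zip(desired, output)]  # point-by-point subtraction
outer_product_update(w23, error, hidden, rate=0.5)

_, output_after = forward(w12, w23, exemplar)
error_after = [d - o for d, o in zip(desired, output_after)]
print(error, error_after)
```

In the optical system the double loop in `outer_product_update` is replaced by a single parallel exposure of the error and input patterns; the electronics handle only the per-neuron thresholding and subtraction.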

Learning  networks  with  localized  lateral  connections  can  also  be  realized  by  placing  the  PCM 
in  a  slightly  misfocused  image  plane  of  the  SLM  rather  than  in  the  Fourier  plane,  as  was 
demonstrated  experimentally  in  the  previous  section.  Such  an  arrangement  would  be  useful  for 
vision  models  that  use  lateral  inhibition. 

Holographic  gratings  normally  form  symmetric  connections.  Many  neural  network  models, 
such  as  the  well-known  back-propagation  model,  assume  symmetric  weights.  However,  SPONN 
can  also  accept  asymmetric  weights.  As  shown  in  Figure  16(a),  asymmetric  SPONN 
interconnections  in  which  the  forward  weight  is  different  from  the  backward  weight  can  be 
implemented  by  spatially  shifting  the  output  plane  relative  to  the  input  plane  in  the  image  processor. 
That  produces  two  separate  connections  between  pairs  of  neurons,  one  for  the  forward  direction 
and  one  for  the  backward  direction.  Such  asymmetric  weights  would  be  useful  for  neural  network 
models  with  dynamic  feedback. 

Second-order  neural  networks  can  also  be  implemented  within  the  SPONN  framework. 
Higher-order neural networks use weight tensors w_ijk... to interconnect products of neuron
activation levels x_j x_k ... to outputs y_i:

$$y_i = \sum_j \sum_k \cdots \; w_{ijk\ldots}\, x_j x_k \cdots$$


Such  networks  are  useful  because  a  single  layer  of  such  high  order  neurons  can  be  used  to  solve 
problems  that  are  not  linearly  separable  and  are  therefore  much  more  powerful  than  first  order 
single  layer  networks  such  as  the  Perceptron.  In  addition,  several  types  of  invariance  (such  as 
translation  and  rotation)  can  be  built  into  them  on  an  a  priori  basis.16  A  limitation  of  high-order 
networks  is  the  large  increase  in  the  number  of  weights  as  the  order  is  increased.  The  parallel 
architecture  of  SPONN  can  be  used  to  advantage  in  implementing  a  high-order  neural  network 
optically.  For  example,  a  possible  SPONN  implementation  of  a  second  order  neural  network  is 
illustrated in Figure 17. The products x_j x_k of the input-layer neuron activation levels are
formed optically by crossing two one-dimensional modulators to form an outer product of the
activation vector x with itself. A third one-dimensional SLM is used to modulate input light with
the output neuron layer activation vector y. The second-order weighted connections are formed by
focusing both the product matrix x^T x and y into the PCM and detecting the conjugate signal. Since
separate weights are formed between each pair of input pixels, a weight w_ijk is formed between
each product x_j x_k and each output neuron y_i. In this way the second-order weighted tensor sum
described by the above equation is formed.

Figure 16. Asymmetric SPONN interconnections. (a) Single layer. (b) Multiple layers.

Figure 17. Second order optical interconnects using one-dimensional SLMs in an outer-product
configuration.
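The second-order sum y_i = Σ_j Σ_k w_ijk x_j x_k can be written out directly. The sketch below is illustrative only; the weight values are chosen by us to show a single second-order neuron computing the parity of two bipolar inputs, a problem that is not linearly separable.

```python
def outer_product(x):
    """Product matrix with entries x[j] * x[k] (the crossed-modulator step)."""
    return [[xj * xk for xk in x] for xj in x]


def second_order_output(weights, x):
    """y[i] = sum over j, k of weights[i][j][k] * x[j] * x[k]."""
    products = outer_product(x)
    n = len(x)
    return [
        sum(w_i[j][k] * products[j][k] for j in range(n) for k in range(n))
        for w_i in weights
    ]


# One output neuron; weights chosen so that y = -x[0] * x[1].
weights = [[[0.0, -0.5],
            [-0.5, 0.0]]]

for x in ([1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]):
    print(x, second_order_output(weights, x))
```

The number of weights grows as the square of the input count for each output neuron, which is the scaling cost noted in the text.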

3.6  RESULTS  OF  PERCEPTRON  LEARNING  EXPERIMENTS 

During  the  1988-1990  contract  period,  we  performed  experiments  implementing  the  concepts 
discussed  above  for  optical  learning  in  SPONN.  Our  first  experiment  was  an  attempted 
implementation  of  the  well-known  Perceptron  learning  algorithm  in  a  self-pumped  SPONN,  using 
the apparatus diagramed in Figure 6. The Perceptron is a single-layer neural
network  consisting  of  a  set  of  input  neurons  connected  to  a  single  output  neuron.  It  can  classify 
linearly  separable  input  patterns.  We  chose  it  for  our  first  attempt  at  implementation  because  it  is 
the  simplest  neural  network  capable  of  learning  and  adaptation.  The  Perceptron  learning  algorithm 
can  be  summarized  as  follows: 

1. Initialize weights between input and output nodes to random values.

2. Enter pattern A_j^m and store resultant value B_i of output node.

3. Form error signal δ_i^m = B_i - B_i^m, where B_i^m is the desired output.

4. Modify weights according to the outer product of the error signal and the input pattern:

Δw_ij = η δ_i^m A_j^m, where η is the adjustable convergence parameter.

5. Increase m by one and return to step 2.


The loop is iterated until the error δ is less than a specified small numerical value ε for all training
patterns.
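The five steps above can be sketched in software as follows. This is a generic illustration, not the code used in the SPONN experiments. Two choices are ours: the weights start at zero rather than at random values so that the run is reproducible, and the error is computed as desired minus actual output so that a positive convergence parameter reduces it. The four-element patterns (with a constant last element acting as a bias input) are hypothetical.

```python
def perceptron_output(weights, pattern):
    """Single output node: hard threshold on the weighted sum of the inputs."""
    total = sum(w * a for w, a in zip(weights, pattern))
    return 1.0 if total > 0.0 else 0.0


def train_perceptron(patterns, desired, rate=0.1, epsilon=1e-9, max_sweeps=1000):
    """Return (weights, sweeps used). Stops when every pattern's error
    magnitude is below epsilon during one full sweep."""
    weights = [0.0] * len(patterns[0])                 # step 1 (zeros, see lead-in)
    for sweep in range(max_sweeps):
        worst = 0.0
        for pattern, target in zip(patterns, desired):  # steps 2 and 5
            output = perceptron_output(weights, pattern)
            delta = target - output                     # step 3 (desired - actual)
            for j, a in enumerate(pattern):             # step 4: outer product
                weights[j] += rate * delta * a
            worst = max(worst, abs(delta))
        if worst < epsilon:
            return weights, sweep + 1
    return weights, max_sweeps


patterns = [[1.0, 1.0, 0.0, 1.0],   # stand-in for the "truck" exemplar
            [1.0, 0.0, 1.0, 1.0]]   # stand-in for the "person" exemplar
desired = [1.0, 0.0]

weights, sweeps = train_perceptron(patterns, desired)
outputs = [perceptron_output(weights, p) for p in patterns]
print(weights, sweeps, outputs)
```

In software the stored weights do not change between updates; in SPONN the readout itself perturbs the gratings, so a zero-error state is not automatically preserved.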

In our SPONN implementation, positive and negative weights were represented with two
pixels per node pair, one for the positive part and one for the negative part of the weight. The two
pixel values were subtracted electronically to form the bipolar output. The training set of input
patterns consisted of the two images shown in Figure 18, a truck and a person. The neuron array
size was 64x64. The bright square in the central part of each image represents the desired state of
the output node. As each image was input into the PCM, the output node state was read by the
frame grabber and compared with the desired state. An error signal, used in displaying the
weights, was then shown on the LCLV along with the input pattern. The frame time was adjusted
to be shorter than the photorefractive response time to prevent the photorefractive gratings from
being in equilibrium with the input patterns.

Figure 18. Input training patterns for Perceptron learning. (a) Input truck image. (b) Input
person image.

The  result  is  evident  in  Figure  19,  which  shows  the  PCM  optical  output  for  the  two  input 
training  patterns;  the  overlapping  output  patterns  are  due  to  the  nonorthogonality  of  the  input 
patterns.  We  are  currently  investigating  the  reasons  for  the  similarity  of  the  two  outputs.  Even 
though the two input patterns have many pixels in common, one would expect a greater difference
between  the  output  patterns.  Figure  20  presents  photographs  of  the  scattered-light  distributions  in 
the  crystal  taken  from  a  vantage  point  above  the  crystal  for  the  two  input  training  patterns.  The 
photographs  indirectly  show  the  general  locations  of  photorefractive  gratings.  The  two  light 
distributions  are  not  identical,  indicating  that  different  gratings  were  formed  for  the  two  input 
patterns. 

Plots  of  the  total  error  versus  iteration  number  showed  that,  after  about  300  iterations,  the 
error  decreased  to  zero  for  several  iterations,  after  which  it  would  increase  and  then  decrease  again. 
That  behavior  was,  we  believe,  due  to  unintended  grating  decay  caused  by  the  destructive  readout 
of  the  PCM  gratings.  A  zero  value  for  the  error  signal  indicated  that  a  solution  had  been  found  for 
the  weights.  The  weights,  however,  were  subsequently  modified  by  the  readout  process. 

3.7  PERMANENT  STORAGE  OF  WEIGHT  VALUES 

Conventional  hologram-fixing  techniques  in  photorefractive  crystals  involve  heating  and/or 
application  of  an  external  electric  field  in  order  to  transfer  photoinduced  gratings  to  space-charge 
gratings  in  optically  insensitive  levels.17  Initially,  a  hologram  is  written  using  conventional 
photorefraction. A π-phase-shifted space-charge pattern that compensates for the photorefractive
hologram is then induced by heating the crystal until ionic charge can move and cancel the space
charge arising from the trapped carriers. Reducing the temperature immobilizes the ions again.
The trapped grating charges are then activated by flooding the crystal with light. Under an applied
or  photovoltaic  field,  the  mobile  charges  drift  and  become  spatially  uniform,  leaving  only  the 
mirror-image  hologram,  which  cannot  be  erased  with  optical  radiation  alone.  The  ionic  hologram 
can  be  erased  by  reheating  the  crystal.  In  some  cases,  externally  applied  electric  fields  can  be  used 
in  place  of  or  in  combination  with  heating  of  the  crystal  to  move  the  ionic  charge.  Researchers  at 
HRL recently demonstrated hologram fixing in Bi12TiO20 using these techniques.18

Figure 19. Output patterns displayed during Perceptron learning. (a) For input truck image.
(b) For input person image.

In ferroelectric materials with low coercive fields (on the order of 1 kV/cm) such as BaTiO3
and Sr(1-x)Ba(x)Nb2O6 (SBN), an alternative technique has been demonstrated: electrical fixing by
domain  reversal.19  (The  coercive  field  is  the  critical  applied  electrical  field  required  to  reverse  the 
polarization  of  a  ferroelectric  crystal.)  A  spatial  pattern  of  domains  can  be  produced  by  applying  a 
field  just  below  the  coercive  value  and  opposed  to  the  orientation  of  the  existing  polarization.  If 
that  is  done  after  the  hologram  is  recorded,  the  domain  pattern  can  mirror  the  recorded  hologram. 
Holograms  fixed  using  domain  reversal  cannot  be  erased  optically,  but  application  of  a  strong 
poling  field  will  restore  the  initial  blank  state  in  the  crystal. 

Storage times greater than an hour can be obtained in BaTiO3 without fixing by reducing the
readout  light  intensity.  Initial  learning  experiments  would  use  dynamic  refreshing  of  the  weights 
instead  of  fixing. 




SECTION  4 
SUMMARY 

In  this  final  report  for  work  performed  in  the  period  March  1988  to  June  1990  we  have 
described  our  efforts  toward  optical  implementations  of  neural  network  models.  Under  this  effort 
we  have  begun  development  of  SPONN  (Stimulated  Photorefractive  Optical  Neural  Network),  a 
hybrid  optoelectronic  system  for  programmable,  adaptive,  and  fully  parallel  direct  physical 
implementations of neural network models. In SPONN, neurons are implemented as two-dimensional
arrays of pixels (10^5 to 10^6 neurons) on a spatial light modulator, which are
interconnected  optically  in  the  third  dimension.  The  nonlinear  neuron  activation  functions  are 
implemented  electronically.  Individual  connection  weights  are  stored  optically  as  a  set  of  angularly 
and  spatially  distributed  gratings  generated  by  stimulated  processes  in  a  photorefractive  medium. 
These  dynamic  processes  also  generate  the  phase  conjugate  of  the  input  light  distribution.  This 
approach  greatly  reduces  crosstalk  between  neurons  due  to  Bragg  degeneracy  and  permits 
significant  increases  in  neuron  and  interconnection  storage  capacities  and  throughput  over 
subsampled optical neural network implementations. Potential throughput rates are as high as 10^11
interconnections  per  second.  Reduced  system  size  and  complexity  result  from  the  use  of  a  single 
photorefractive  crystal  for  all  optical  tasks,  including  weight  storage  and  beam  routing  by  means  of 
phase  conjugation.  In  addition,  the  phase  conjugation  compensates  for  distortions  in  the  optical 
components.  The  architecture  is  programmable  and  expandable,  and  it  permits  the  implementation 
of  both  fully  and  partially  interconnected  multilayer  neural  network  learning  models  (e.g.,  the  well 
known  back-propagation  model),  including  laterally  connected  models.  Both  globally  and  locally 
connected  neural  network  models  can  be  mapped  onto  the  architecture.  Higher  order  neural 
network  models  in  which  connection  weights  are  tensors  rather  than  matrix  elements  can  also  be 
implemented. We have described experimental results on SPONN connectivity, crosstalk
suppression, and weight modification using the Perceptron learning algorithm.

Abstract


We  describe  an  optical  interconnection  method  based  on  self-pumped  phase 
conjugate  mirrors  in  which  each  connection  weight  is  distributed  among  many 
angularly  and  spatially  multiplexed  gratings.  This  approach  greatly  reduces 
crosstalk  caused  by  the  conical  Bragg  degeneracy  associated  with  a  single 
grating  and  permits  the  entire  input  plane  to  be  used.  Applications  to  optical 
neural  networks  are  described. 


Optics  is  often  suggested  as  an  alternative  to  electronic  implementations  of  neural 
network  models  because  of  its  inherent  parallelism  and  three-dimensional  connectivity.  The 
global  connectivity  of  optics  is  particularly  appealing  with  regard  to  the  communication 
requirements  of  many  neural  network  models  in  which  each  processing  node  or  "neuron" 
receives a weighted sum of the outputs of the neurons in the preceding layer. Both spatial
light modulator (SLM)-based and holographic approaches for storing the weights have been
proposed.  Holographic  approaches  based  on  photorefractive  materials  are  attractive  for  the 
implementation  of  large  neural  networks  because  of  the  large  storage  capacity1  and  the 
capability  for  the  adjustment  of  all  inter-layer  weights  in  parallel.  To  the  best  of  our 
knowledge,  all  previous  holographic  proposals  have  utilized  one  photorefractive  grating  to 
store  each  connection  weight. 

A  limitation  of  the  single  grating  per  weight  approach  is  that  even  if  the  gratings  are 
formed  in  a  thick  medium  with  high  Bragg  selectivity,  reading  beams  other  than  the  pair  that 
originally  wrote  the  grating  can  reconstruct  an  output  beam.  For  a  single  grating,  all  incident 
K  vectors  which  lie  on  a  cone  defined  by  the  Bragg  angle  will  read  out  the  grating.  This 
Bragg  degeneracy  cone  results  in  crosstalk  between  neurons  which  is  unacceptable  in  neural 
network  models.  One  approach  which  has  been  suggested  to  avoid  this  problem  is  to  arrange 
the  pixels  on  the  input  and  output  planes  in  special  nonredundant  patterns  such  that  unique 
angles  between  pairs  of  writing  and  reading  beams  are  defined.2  Extraneous  connections  are 
still  formed  but  they  are  made  to  areas  of  the  input/output  planes  which  are  not  allowed  to 
contribute to the final output. Although this approach solves the grating crosstalk problem, it
also results in subsampling of the input SLMs and under-utilization of the available SLM
space-bandwidth product. Specifically, if the SLM is capable of displaying N^2 neurons with
N^4 potential interconnections, then the single grating per weight approach can only
implement N^(3/2) neurons and N^3 weights, provided the storage capacity of the
photorefractive crystal is not exceeded.2 The diffraction efficiency is also reduced because
where x_p and x_s are coordinates along the directions of propagation of A_p and A_s, n_p and n_s
are the refractive indices in those directions, k is the optical wavenumber, r_eff is the effective
electrooptic coefficient in the direction of propagation, x is the space-charge field decay rate,
α is the absorption coefficient, and E_sc = E_d/(1 + E_d/E_q), where E_d is the diffusion field and
E_q is the limiting space-charge field. It is clear from the above equations that the connection
weight between the two amplitudes A_p and A_s increases with the space-charge field E. The
growth of E during the formation of the grating is in turn determined by the product A_p A_s*,
which matches the Hebbian learning rule common to many neural net models. Moreover,
the observed distributions of beams within a self-pumped PCM, which are determined by the
high coupling gain of BaTiO3, scattering centers, reflections from crystal faces, and the
geometry of the PCM configuration,4 suggest that light beams from the entire input plane
mix in the crystal, resulting in the global interconnection of input pixels by a self-pumped
PCM, especially if the PCM is in the Fourier plane of the input spatial light modulator.5,6
Since a beam from one pixel must diffract from a large set of spatially distributed gratings in
order to form the conjugate of a second pixel, the crosstalk should be low according to the
arguments presented previously.
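
The two limits of the diffusion-limited space-charge field quoted above can be checked
numerically. The following is a minimal sketch: the function name and the normalized field
values are illustrative assumptions and are not taken from the experiments.

```python
def space_charge_field(e_d, e_q):
    """Steady-state space-charge field E_sc = E_d / (1 + E_d / E_q).

    e_d is the diffusion field and e_q the limiting space-charge field,
    both in the same (arbitrary) units.
    """
    if e_d < 0 or e_q <= 0:
        raise ValueError("expected e_d >= 0 and e_q > 0")
    return e_d / (1.0 + e_d / e_q)


if __name__ == "__main__":
    # Hypothetical normalized values, chosen only to show the two limits.
    for e_d, e_q in [(1.0, 1000.0), (1.0, 1.0), (1000.0, 1.0)]:
        print(e_d, e_q, space_charge_field(e_d, e_q))
```

When E_d is much smaller than E_q the field follows E_d, and when E_d is much larger it
saturates at E_q, so under this expression the attainable weight is bounded by the smaller
of the two fields.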

We  have  performed  a  series  of  experiments  to  test  these  conjectures  for  the  grating 
selectivity  and  global  connectivity  of  the  SPONN  (Stimulated  Photorefractive-effect  Optical 
Neural Network) approach. In our first set of experiments, we tested the Bragg selectivity of
a self-pumped PCM operating in the internal loop geometry.7 The BaTiO3 crystal was
obtained from Sanders Associates. The laser source was an argon ion laser operating at 514
nm  which  illuminated  a  fixed  mask  with  a  9x9  square  array  of  pixels  consisting  of  1  mm 
diameter  holes.  The  transmitted  light  was  then  focused  into  the  crystal  using  a  100-mm  focal 
length  lens.  The  crystal  was  located  in  the  Fourier  plane  of  the  mask  in  order  to  maximize 
the  overlap  between  light  beams  from  the  pixels.  The  steady  state  conjugate  return  is  shown 
in Fig. 2a. The mask was then translated in a direction transverse to the beam path by half of
the hole period in a time span short compared to the photorefractive response time of 5 sec,
which was set by the total power incident on the crystal of 1 mW. The output plane
immediately  after  the  translation  is  shown  in  Fig.  2b.  The  lack  of  any  observable  signal 
despite  the  regular  arrangement  of  pixels  in  the  input  plane  confirms  that  very  little  crosstalk 
due  to  the  Bragg  degeneracy  effect  is  present  in  SPONN.  The  signal-to-noise  ratio  of  the 
CCD  video  camera  was  100:1.  Translating  the  array  by  another  half  period  so  that  the 
original  positions  of  the  holes  were  reproduced  resulted  in  the  immediate  reconstruction  of 
the  conjugate  signal,  verifying  that  the  gratings  had  not  been  erased. 

We  then  replaced  the  fixed  mask  with  a  Hughes  Liquid  Crystal  Light  Valve  (LCLV) 
in  order  to  demonstrate  global  connectivity  and  associative  recall.  In  this  experiment  the 
initial  input  consisted  of  a  16x16  array  of  pixels,  each  of  which  was  randomly  assigned 
values  of  1  or  0.  The  steady  state  conjugate  output  is  shown  in  Fig.  3a.  We  then  switched  to 
an  input  consisting  of  a  single  pixel  by  translating  an  opaque  mask  with  a  single  small 
aperture  in  front  of  the  LCLV.  (By  using  an  opaque  mask  rather  than  simply  turning  off  the 
other  pixels  we  eliminated  extraneous  readout  of  the  PCM  by  background  light  due  to  the 
finite  contrast  ratio  of  the  LCLV.)  The  single-pixel  input  is  shown  in  Fig.  3b  and  the 
resultant  conjugate  output  of  the  entire  input  pattern  is  shown  in  Fig.  3c.  Note  that  weights 
were  formed  between  the  pixel  and  all  of  the  other  active  pixels,  demonstrating  associative 
recall  and  global  connectivity  with  a  fanout  of  1:128.  The  degree  of  fanout  we  could 
demonstrate  was  limited  by  the  sensitivity  of  our  CCD  camera,  not  by  the  PCM.  The  fact 
that  each  pixel  occupied  1/1000  of  the  active  area  of  the  LCLV  suggests  that  a  fanout  of 
1:1000  could  have  been  observed  if  a  sufficiently  sensitive  camera  had  been  available. 

Neural  network  models  can  be  implemented  using  the  multiple-grating  per  weight 
approach.  The  complex  reflectance  of  pixels  on  the  LCLV  would  represent  neuron  activation 
levels.  The  conjugate  return,  consisting  of  the  inputs  to  each  neuron  summed  over  the 
photorefractive  weights,  is  directed  by  a  beam  splitter  into  the  CCD  camera,  the  output  of 
which is digitized and thresholded at video rates using lookup tables in an image processor
card in the host computer. In the case of feedforward supervised learning networks such as
backpropagation,  error  signals  can  be  calculated  by  the  host  and  displayed  on  the  LCLV.  As 
discussed  above,  weight  changes  follow  a  Hebbian  or  outer-product  learning  rule.  The 
frame  time  of  the  LCLV  would  be  adjusted  to  be  shorter  than  the  photorefractive  response 
time  so  that  the  gratings  are  not  in  equilibrium  with  the  input  light,  since  learning  requires 
that  the  output  be  dependent  on  the  previous  exposure  history.  Bipolar  weights  and  weight 
changes  can  be  implemented  either  by  coherent  detection  and  erasure  or  by  employing 
separate  positive  and  negative  weights.  The  bipolar  outputs  can  be  formed  electronically  by 
subtracting the contributions of the two sets of weights. Multi-layer neural networks can be
programmed in the same system by spatially multiplexing the layers on the LCLV and
sequencing through adjacent layers.
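
The scheme of separate positive and negative weights with electronic subtraction can be
sketched in a few lines. This is only an illustration under stated assumptions: the function
names, the learning rate, and the 2x2 example are hypothetical, and the weights are plain
numbers standing in for grating strengths.

```python
def hebbian_update(w_pos, w_neg, pre, post, rate=1.0):
    """Add the outer product rate * post[i] * pre[j] to a dual-rail weight store.

    Positive changes are written to w_pos and negative changes to w_neg,
    so both stores only ever receive non-negative increments.
    """
    for i, y in enumerate(post):
        for j, x in enumerate(pre):
            delta = rate * y * x
            if delta >= 0:
                w_pos[i][j] += delta
            else:
                w_neg[i][j] -= delta


def bipolar_output(w_pos, w_neg, x):
    """Form the signed output by subtracting the two weighted sums."""
    out = []
    for row_p, row_n in zip(w_pos, w_neg):
        plus = sum(w * xi for w, xi in zip(row_p, x))
        minus = sum(w * xi for w, xi in zip(row_n, x))
        out.append(plus - minus)
    return out


if __name__ == "__main__":
    w_pos = [[0.0, 0.0], [0.0, 0.0]]
    w_neg = [[0.0, 0.0], [0.0, 0.0]]
    # Non-negative input activations, signed error signal (hypothetical values).
    hebbian_update(w_pos, w_neg, pre=[1.0, 0.0], post=[0.5, -0.5])
    print(bipolar_output(w_pos, w_neg, [1.0, 0.0]))
```

In this sketch the sign of each weight change is carried by the error signal alone, which is
one way to keep the displayed input activations non-negative.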

In  summary,  we  have  discussed  SPONN,  a  method  for  holographically 
interconnecting  optical  neurons  which  distributes  each  connection  weight  among  a  set  of 
angularly  and  spatially  multiplexed  gratings  generated  in  self-  and  mutually-pumped  phase 
conjugate  mirrors.  We  have  presented  experimental  evidence  of  the  reduced  crosstalk, 
optimum  SLM  space-bandwidth  product  utilization,  and  global  connectivity  of  SPONN,  and 
discussed  an  architecture  for  implementation  of  multi-layered  feedforward  neural  networks. 

This  work  was  supported  in  part  by  the  Air  Force  Office  of  Scientific  Research  and 
the  Defense  Advanced  Research  Projects  Agency.  We  would  like  to  thank  C.  Deanda  for 
skillful  technical  assistance  and  G.  Valley  and  G.  Dunning  for  helpful  discussions. 


Figure  Captions 


Figure 1-A. Ewald sphere momentum-space diagram for Bragg matching to two gratings in
series. Only a single input/output wavevector pair can lie on the two Bragg cones
and satisfy the Bragg conditions at both gratings simultaneously.

Figure 2-A. Demonstration of elimination of Bragg degeneracy and crosstalk suppression.
(a) Steady state conjugate output. (b) Zero output observed after input array was
shifted by half a period. The conjugate returned immediately after the input array
was shifted by another half period.

Figure 3-A. Demonstration of 1:128 fanout, global connectivity, and associative recall.
(a) Steady-state conjugate output for a 16x16 random binary pattern input.
(b) Partial input consisting of a single pixel. This represents only 1/1000 of the
active area of the LCLV. (c) Corresponding PCM output immediately after input
was switched to that shown in (b).

met by conventional computer architectures which use a small number of processing units,
bus-oriented architectures, and address-based random access memory.

A more suitable memory approach for neural network models is associative memory.
Associative memories have long been a subject of active research in both optical and
electronic computing. As described above, this type of memory is fundamentally different
from conventional random access memory in that no separate address exists for each stored
entity. Instead, the datum itself acts as a pointer to either itself (homoassociation) or to
other stored data (heteroassociation). Data can flow through the system, exciting chains of
associations until a decision is reached in a global and parallel manner. Associative
memories also have error correction properties in that a complete undistorted set of data
can be retrieved using a distorted or partial version of input data. Error correction,
which stabilizes the flow of decision making through the neural network, is implemented
using nonlinearities and feedback. Many associative memory mathematical models have been
published and simulated on serial electronic computers [1]. However, it is inefficient to
map such highly-parallel and fine-grained models onto single-processor serial computers.
Ideally, the architecture of a neural network computer should reflect the highly parallel,
associative, fine-grained, and nonlinear analog nature of the neural network models. In
particular, it would be advantageous to devote a processing unit to each neuron rather than
multiplex neurons among processing units. One approach to achieving such an architecture is
to use analog optical methods for parallel communication between a large number of
processing units represented by planes of pixels.
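
The error-correcting recall described above, in which nonlinearity and feedback restore a
complete pattern from a distorted or partial one, can be illustrated with a small
outer-product model of the Hopfield type. This is a hedged sketch of such a model and not a
simulation of any optical system; the two 8-element patterns and all function names are
hypothetical.

```python
def sign(v):
    """Hard threshold to +1/-1 (zero is mapped to +1)."""
    return 1 if v >= 0 else -1


def store(patterns):
    """Outer-product (Hebbian) weight matrix with a zero diagonal."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w


def recall(w, probe, max_iters=10):
    """Apply thresholded feedback until the state stops changing."""
    state = list(probe)
    for _ in range(max_iters):
        new_state = [sign(sum(wij * sj for wij, sj in zip(row, state)))
                     for row in w]
        if new_state == state:
            break
        state = new_state
    return state


# Two hypothetical orthogonal stored patterns.
P1 = [1, 1, 1, 1, -1, -1, -1, -1]
P2 = [1, -1, 1, -1, 1, -1, 1, -1]
W = store([P1, P2])

distorted = [-1, 1, 1, 1, -1, -1, -1, -1]   # P1 with its first element flipped
partial = [1, 1, 1, 1, -1, -1, 0, 0]        # P1 with two elements missing

print(recall(W, distorted) == P1)
print(recall(W, partial) == P1)
```

The datum itself addresses the memory here: no index into the list of stored patterns is
ever supplied, and the feedback loop settles on the stored pattern closest to the probe.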

The good match between the parallelism and interconnectivity of optics and the requirements
of associative memory paradigms has not gone unnoticed over the years. Gabor, the inventor
of holography, appreciated its associative properties [9]. Some of the early experiments in
holographic associations were performed by Collier and Pennington. These efforts, in which
a hologram was formed from two object wavefronts, were termed “ghost image holography.”
When the hologram was subsequently illuminated with part of wavefront A, a complete version
of wavefront B was reconstructed. These holographic associative memories suffered from
distortions, poor signal-to-noise ratio (SNR), and low storage capacity. Later,
page-oriented holographic memories were developed which used mechanical or acoustooptic
deflection of reference beams to read out one of many spatially-separated subholograms. The
selection of a particular stored page for readout was based on the correlation of the input
wavefront with the stored wavefronts.

The results of recent research in neural network models have inspired workers in optics to
add gain, nonlinear feedback, and competition to holography and create a new class of
optical associative memory, the NHAM (nonlinear holographic associative memory). Phase
conjugation is often used to implement these features of associative memory. The
performance of NHAM-type associative memories is potentially superior to linear correlator
approaches because, in addition to increased storage capacity and discrimination, the
nonlinearities in NHAM allow it to make decisions and choose between a set of competing
possibilities on the basis of ambiguous inputs. Most importantly, its conceptual basis can
be expanded to include optical implementations of neural network models.

In Section II, after briefly describing linear holographic associative memories, I will
discuss some of the theoretical aspects of a generic NHAM. In particular, I will describe
the relationship of NHAM to certain higher order correlation neural network models.
Well-known examples of first-order correlation neural network models are the outer-product
models of Anderson [10], Kohonen [5], and Hopfield [11]. Grossberg’s formulations [12] also
contain outer-product terms. Outer-product models are in fact forms of the well-known
Hebbian model of synaptic learning. Higher order correlation models are generalizations of
outer-product models in which the coupling matrix between neurons is a tensor. Section III
is devoted to descriptions of some representative experimental implementations of NHAM’s.

In any review paper it is necessary to limit the topic of discussion. In keeping with the
theme of this special issue, I will limit myself to nonlinear optical holographic
implementations of associative memory using phase conjugation or optical retroreflection.
I will not discuss matrix-vector multiplier optical implementations of associative memory
[13], nor will I discuss more general optical neural network architectures capable of
supervised or unsupervised learning. Optical neural networks based on matrix-vector
multiplication use spatial light modulators as two-dimensional masks to store the
interconnection weights between arrays of discrete emitters and detectors. Multilayer
optical neural network architectures [14] based on storing weights as holographic gratings
in photorefractive crystals have been proposed which are capable of implementing such
neural network paradigms as backward propagation and simulated annealing. For more
information on these subjects the reader should consult the references. Finally, I wish to
apologize in advance to any workers whose relevant work has inadvertently not been included
here.

II.  Optical  Associative  Memories 

In this section I will discuss some theoretical issues related to storage capacity which
are common to various NHAM implementations. However, it will be instructive first to
briefly discuss earlier work in linear holographic associative memories in order to
establish basic principles. These principles will provide a framework for the discussion of
nonlinear holographic associative memories which incorporate feedback and gain using phase
conjugate resonator configurations.

A.  Linear  Holographic  Associative  Memories 

1) Ghost Image Holography and Page-Oriented Holographic Memories. The associative
properties of holography have been recognized ever since the invention of holography by
Gabor [15]. Van Heerden [16] predicted in 1963 that a hologram would produce a “ghost
image” of a stored image upon illumination of the hologram with a fragment of the original
image. This was subsequently confirmed by Stroke et al. [17]. These early ghost image
experiments were characterized by poor image quality and signal-to-noise ratio (SNR). The
invention of off-axis holography by Leith and Upatnieks [18] greatly improved the SNR by
angularly separating the desired signal term from the undesired noise due to
self-interference among scattered waves from the original image. Pennington and Collier
[19] demonstrated ghost image reconstructions using this off-axis approach.

Ghost image holography can be mathematically described as follows. Consider two complex
wave amplitudes a(x, y) and b(x, y) in a first plane (x, y). The two wavefronts are allowed
to propagate over a distance L to a second plane (u, v) where a photosensitive holographic
plate is located. Assuming the transmission of the developed plate is linearly proportional
to the incident light intensity and diffraction within the hologram can be neglected (thin
hologram approximation), the amplitude transmission of the plate will be proportional to

    T(u, v) = |A(u, v) + B(u, v)|^2
            = |A|^2 + |B|^2 + B^* A + B A^*                                (1)

where A(u, v) and B(u, v) are the Fresnel-Kirchhoff transforms of a(x, y) and b(x, y),
respectively:

    A(u, v) \propto \iint a'(x, y) e^{-2\pi i (xu + yv)/\lambda L} dx dy

where

    a'(x, y) = a(x, y) e^{\pi i (x^2 + y^2)/\lambda L}.

When the hologram is subsequently illuminated with wavefront A, the resultant output
A(u, v) T(u, v) will consist of several terms:

    A(u, v) T(u, v) = |A|^2 A + |B|^2 A + A A B^* + B A^* A.               (2)

The first two terms represent on-axis noise terms. The third term is an off-axis “twin
wave” which is not of interest. The last term (which is angularly separated from the other
terms assuming a(x, y) and b(x, y) were spatially separated in the (x, y) plane) represents
the basis for holographic associative memory. The analysis will be simplified without loss
of generality if we assume the spherical phase terms in a'(x, y) and b'(x, y) are canceled
using lenses so that a'(x, y) = a(x, y). This corresponds to forming Fourier rather than
Fresnel holograms. If we Fourier transform the output of the hologram with a lens and
consider only the last term in (2), the result is

    output = FT{B A^* A}
           = b * (a ⊛ a)                                                   (3)

where * and ⊛ denote convolution and correlation, respectively. The origin of the
associative ghost image is now clear. The input image a(x, y) is correlated with itself and
then convolved with the associated image b(x, y). If the autocorrelation of a(x, y) is
sharply peaked, convolving it with b(x, y) results in an output closely resembling b(x, y).
Thus upon input of a(x, y) the wavefront b(x, y) is reconstructed, forming a
heteroassociation. Since fragments of a(x, y) also form sharp correlation peaks when
correlated with a(x, y), a complete version of b(x, y) is still formed when a partial
version of a(x, y) addresses the hologram, although the reconstruction is of reduced
resolution. Leith and Upatnieks demonstrated the marked improvement in image reconstruction
quality possible by using diffuse illumination. This has the effect of increasing the
spatial frequency content of a(x, y) and thereby sharpening its autocorrelation, which
improves the resolution of b(x, y).
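
The correlation-then-convolution structure of (3) can be tried out on short sequences. The
sketch below is a one-dimensional, real-valued, circular model and is only an assumption
made for illustration: the optical fields are two-dimensional and complex, and the length-7
code A and the "image" B are hypothetical data.

```python
def correlate(f, g):
    """Circular cross-correlation: out[k] = sum_i f[i] * g[(i + k) % n]."""
    n = len(f)
    return [sum(f[i] * g[(i + k) % n] for i in range(n)) for k in range(n)]


def convolve(f, g):
    """Circular convolution: out[k] = sum_i f[i] * g[(k - i) % n]."""
    n = len(f)
    return [sum(f[i] * g[(k - i) % n] for i in range(n)) for k in range(n)]


def ghost_image(probe, a, b):
    """b convolved with the correlation of the probe and the stored a."""
    return convolve(b, correlate(probe, a))


# Length-7 +1/-1 code whose circular autocorrelation is 7 at zero shift
# and -1 at every other shift, i.e. sharply peaked.
A = [1, 1, 1, -1, 1, -1, -1]
# Hypothetical associated "image".
B = [0, 3, 1, 0, 0, 0, 0]

full = ghost_image(A, A, B)                          # complete a as input
partial = ghost_image([1, 1, 1, -1, 0, 0, 0], A, B)  # fragment of a as input

print(full)
print(partial)
```

With the complete input the output is a scaled copy of B on a uniform background. With the
fragment the brightest sample still falls where B is brightest, but the background is no
longer uniform, which mirrors the reduced reconstruction quality noted in the text.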

Vander Lugt [20] introduced the use of off-axis holography for matched filter recognition
of objects by letting b(x, y) be a delta function so that B(u, v) is a tilted plane wave.
If a lens is now placed behind the hologram the correlation of the input image with the
stored image appears in the back focal plane. If the input image matches the stored image a
bright spot appears in the back focal plane or correlation plane. Moreover, the location of
this spot corresponds directly to the location of the matching image in the input plane.
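
The statement that the correlation spot tracks the position of the matching image can be
checked with a one-dimensional circular model. This is a minimal sketch under the same
simplifying assumptions as before (real-valued, discrete, circular shifts); the sequences
and function names are hypothetical.

```python
def correlate(f, g):
    """Circular cross-correlation: out[k] = sum_i f[i] * g[(i + k) % n]."""
    n = len(f)
    return [sum(f[i] * g[(i + k) % n] for i in range(n)) for k in range(n)]


def shift(seq, s):
    """Circularly shift seq to the right by s samples."""
    n = len(seq)
    return [seq[(i - s) % n] for i in range(n)]


def locate(stored, scene):
    """Return (position, value) of the brightest point of the correlation plane."""
    plane = correlate(stored, scene)
    peak = max(plane)
    return plane.index(peak), peak


STORED = [1, 1, 1, -1, 1, -1, -1]       # hypothetical stored image
OTHER = [1, -1, 1, -1, 1, -1, 1]        # hypothetical non-matching input

print(locate(STORED, shift(STORED, 3)))  # peak position follows the shift
print(locate(STORED, OTHER))             # weaker peak for a non-matching input
```

The peak value reaches the full energy of the stored image only for a match, and its
position reports where in the input the match occurred.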

The Vander Lugt linear optical correlator has found many applications in pattern
recognition, signal processing, and optical associative memories. One of the earliest
applications of the optical correlator for optical associative memories was in the
page-oriented holographic associative memory (HAM) [21] for digital computers. In this
application memory data were stored in a large number of spatially-multiplexed holograms.
During recording different data planes or “pages” were recorded in each hologram
sequentially by shifting a plane wave reference from hologram to hologram. In the readout
phase the light from the input data page illuminated the entire set of holograms. An
associative search of all of the stored data could be performed simultaneously. A detector
matrix determined the location of the resultant correlation peak which determined the
location of the hologram containing the matching data. This information was used to shift a
readout reference to the proper hologram for readout of the associated data. The system
could also be used for heteroassociation by shifting the readout beam to a hologram
different from the matching one. Associations could be made by processing the correlation
plane with lookup tables.
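
The search-then-lookup readout just described can be summarized schematically. The sketch
below is an assumption-laden illustration, not a model of the referenced hardware: pages are
short +1/-1 vectors, the correlation peak is reduced to a single overlap value per page, and
the lookup table that redirects the readout beam is a hypothetical list of page indexes.

```python
def overlap(u, v):
    """Zero-shift correlation (inner product) of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def associative_lookup(pages, table, probe):
    """Find the best-matching page, then read out the page the table points to.

    Returns (index of matching page, page selected for readout).
    """
    scores = [overlap(page, probe) for page in pages]   # parallel search
    best = scores.index(max(scores))                    # detector matrix
    return best, pages[table[best]]                     # steer readout beam


# Hypothetical stored pages (mutually orthogonal +1/-1 vectors).
PAGES = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
    [1, 1, -1, -1, 1, 1, -1, -1],
]
# Hypothetical lookup table: page 0 recalls itself, page 1 recalls page 2,
# and page 2 recalls page 1 (heteroassociation).
TABLE = [0, 2, 1]

probe = [1, -1, 1, -1, 1, -1, 1, 1]     # page 1 with its last element flipped
print(associative_lookup(PAGES, TABLE, probe))
```

The final readout step is a single serial selection, which is the limitation taken up in
the following paragraph.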

Such page-oriented associative holographic memories are capable of large storage capacities
but are limited in some respects. In particular, the systems are not shift invariant. They
work best if the matching patterns always appear in the same position. In addition, they
handle multiple associations serially because of the mechanical scanning of the readout
beam. This lookup table approach makes page-oriented HAM’s unsuitable for implementations
of neural network model-inspired associative memories. In response to the need for
highly-parallel architectures for neural network models, a new class of HAM’s has been
developed recently which is also based on the optical correlator. These new devices also
perform associations using correlation as a measure of similarity. However, unlike
page-oriented HAM’s, these nonlinear holographic associative memories (NHAM’s) use
nonlinear gain and feedback provided by phase conjugation to implement competition between
stored memories. This competition is used to perform associations with error correction and
improved SNR on multiple inputs in parallel.

B.  Nonlinear  Holographic  Associative  Memories 

1) General Description. Both ghost image holography and Vander Lugt (matched filter)
correlators are forms of optical associative memories in that they return one image when
addressed with another. Ghost image holography, however, suffers from poor storage capacity
and SNR due to distortions arising from the correlation-convolution operations described in
the previous section. Spatial multiplexing cannot be used to improve the SNR if all stored
images or “objects” are to be recalled in the same position, which results in the
superposition of cross-correlation noise in the output plane. This superposition further
reduces the SNR and the storage capacity. The Vander Lugt correlator, on the other hand,
has good SNR due to its large processing gain. However, it is not very useful as an
associative memory because it maps input objects into autocorrelation peaks in the output
plane instead of associating one optical image or object with another.

The NHAM is an optical associative memory which combines the fully-parallel image-to-image
heteroassociative capabilities of ghost image holography with the high SNR, processing
gain, and storage capacity of thresholded Vander Lugt correlators. In addition,
nonlinearities allow an NHAM to select a particular stored memory over all others on the
basis of incomplete input data. A schematic diagram of a representative NHAM system is
shown in Fig. 1. The heart of the system is a hologram in which Fourier transforms of
objects a^m are recorded sequentially using angularly multiplexed reference beams b^m, as
shown in Fig. 1. For readout of the NHAM, phase conjugate mirrors (PCM’s) or other means of
forming retroreflected time-reversed beams are positioned on both sides of the hologram,
forming a phase conjugate resonator. The hologram divides the resonator into the object and
reference legs. When a partial or distorted version of object m0 (a^{m0}) addresses the
hologram via the beamsplitter, a set of partially-reconstructed reference beams (b^m) is
generated. Each reconstructed reference beam is convolved with the correlation of the input
object with the stored object associated with that particular reference beam. This part of
the system is identical to a matched filter Vander Lugt correlator. The distorted
reconstructed reference beams are phase conjugated by the reference leg PCM and retrace
their paths to the hologram. These beams then reconstruct the complete stored objects. The
reconstructed objects are phase conjugated by the object leg PCM and the process is iterated
until the system settles into a self-consistent solution or eigenmode, assuming the gain of
the PCM’s is sufficient for oscillation. In the absence of the hologram the phase conjugate
resonator can support a continuum of different resonator modes. The eigenmodes of the NHAM
resonator are defined by the stored wavefronts in the hologram.

Fig. 1. Recording and readout of objects in reference-based NHAM.

An important common feature of NHAM’s is nonlinearity. Without it NHAM’s could not “choose”
a particular memory over all others and the output would be a linear superposition of
multiple recalled memories. If the stored objects are considered to be vectors in state
space, then NHAM nonlinearities form regions of attraction around the stored object vectors
in a manner analogous to neural network formulations of associative memory. The nonlinear
response and multiple stable states of the NHAM allow selections between patterns to be
made on the basis of incomplete data since gain will exceed loss only for the stored
pattern with the largest overlap with the input pattern. Nonlinearities also improve the
SNR and storage capacity over ghost image holography or linear matched filter correlators.
The output association is available in two forms depending on where the output is coupled
out. The reference side of the NHAM is essentially a Vander Lugt correlator where a
correlation peak marks the location of the recognized object in the input plane. In the
object leg an undistorted version of the associated stored object is superimposed over the
partial input object. The output can be separated from the input with a beamsplitter.
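
The role of the nonlinearity in selecting one memory over a superposition can be seen in a
toy round-trip iteration. The sketch below rests on several assumptions that are not taken
from the text: each correlation is reduced to its zero-shift value (so shift invariance is
ignored), the reference-leg nonlinearity is a thresholded square law, the output
nonlinearity is a hard limiter, and the three stored objects and the seed are hypothetical.

```python
def overlap(u, v):
    """Zero-shift correlation (inner product) of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def reference_leg(c):
    """Assumed reference-leg nonlinearity: thresholded square law."""
    return c * c if c > 0 else 0


def output_plane(v):
    """Assumed output point nonlinearity: hard limiter to +1/-1."""
    return 1 if v >= 0 else -1


def round_trip(state, stored, f):
    """Correlate with each stored object, apply f, and re-superpose the objects."""
    gains = [f(overlap(state, obj)) for obj in stored]
    return [output_plane(sum(g * obj[i] for g, obj in zip(gains, stored)))
            for i in range(len(state))]


def nham_recall(seed, stored, f=reference_leg, max_trips=10):
    """Iterate round trips until the state reproduces itself."""
    state = list(seed)
    for _ in range(max_trips):
        nxt = round_trip(state, stored, f)
        if nxt == state:
            break
        state = nxt
    return state


STORED = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
    [1, 1, -1, -1, 1, 1, -1, -1],
]
SEED = [1, 1, -1, 1, -1, 0, 0, 0]   # partial and distorted version of object 0

print(nham_recall(SEED, STORED))                     # nonlinear reference leg
print(nham_recall(SEED, STORED, f=lambda c: c))      # linear reference leg
```

With the thresholded square law the iteration settles on stored object 0, the one with the
largest overlap with the seed. With a linear reference leg the same seed settles into a
mixture that matches none of the stored objects.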

Fig. 2. Block diagram of iterative model of NHAM.

2) Storage Capacity. In this section I will discuss the effects of nonlinearities in the
reference leg on the SNR and storage capacity of NHAM’s. The resonator nature of the NHAM
is illustrated in the block diagram of Fig. 2. Assuming thin Fourier transform holograms
and using the same approach as in (1)-(3), an iterative equation can be written for the
NHAM output:

    a_n = F{ \sum_{m'} \sum_m f[(a_{n-1} ⊛ a^m) * b^m] ⊛ b^{m'} * a^{m'} },        (4)

where a_n is the amplitude in the object leg after the nth round-trip through the
resonator, a^m are the objects stored in the hologram, b^m are the reference beams used in
holographically recording the objects, f( ) represents the nonlinear reflectivity of the
reference leg, and F( ) represents an output plane point nonlinearity. The input “seed” a_0
for the resonator is a partial or distorted version of object m0. The output in the nth
round-trip consists of a double sum of cascaded correlations-convolutions. The double sum
over the object indexes m and m' is due to the double-pass through the hologram. Assuming
the reference beams are angularly multiplexed plane waves, the b^m functions are spatially
displaced delta functions:

    b^m = \delta(x - x_m).                                                 (5)

(It should be noted that although all the calculations here are being done in one
dimension, these results are readily extended to the two-dimensional images in NHAM
associative memories.) Substituting (5) in (4) results in the following iterative equation
for the object leg optical amplitude distribution after the first round-trip through the
resonator:

    a_n(x) = F{ \sum_{m'} \sum_m f[C_n^m(x - x_m + x_{m'})] * a^{m'}(x) }          (6)


where

    C_n^m(x) = a_{n-1} ⊛ a^m.

C_n^m(x) is the correlation between the stored object m and the resonator amplitude
distribution in the nth iteration. I have assumed that the angular separation between
reference beams is large enough to separate the correlation-convolution terms in the
reference leg, which permits me to disregard cross terms due to the nonlinear reflectivity
f( ). To facilitate the analysis and allow direct comparisons with some outer-product type
neural network models of associative memory, I will assume objects consist of N-dimensional
vectors whose components assume values of +1 or -1. (Objects consisting of analog vectors
can also be stored in NHAM’s. This binary representation is used to simplify the analysis.)
I will further assume that the reference functions b^m are uniformly distributed and
equally spaced in the object plane. If these spacings are wider than the widths of the
objects, then by placing an aperture over the output plane only reconstructions for which
m = m' in (6) are retained. This aperture prevents ambiguities in the output plane which
would otherwise occur if a thin hologram is used. The reference beam reconstructs not only
object m0 centered on the input object but also all other stored objects. The aperture
blocks these other objects since they are displaced from object m0, but at the cost of a
reduced amount of shift invariance in the field of view (FOV). As more objects are stored
the amount of unambiguous shift invariance is decreased proportionally. The hologram can
store only a single object with shift invariance over the entire FOV. (Another limitation
on the shift invariance and storage capacity is that the total space-bandwidth product of
all shifted versions of the stored objects cannot exceed the space-bandwidth product of the
hologram [22], [23].)

An estimate can be made of this FOV tradeoff between
the number of stored objects and degree of shift invariance.
For example, assuming a Fourier transform lens focal
length of $F$, a shift invariance of $X$ implies an angular
spectrum range at the hologram of

$$\theta = X/F. \tag{7}$$

If we further assume the hologram has good diffraction
efficiency for a range $\Phi$ of reference-object beam angles,
then the number of objects that can be stored with shift
invariance $X$ in two dimensions is limited by FOV
ambiguity to

$$M = (\Phi/\theta)^2. \tag{8}$$

For parameter values of $F = 500$ mm, $X = 10$ mm, $\Phi =
30°$, and out-of-plane reference beams, the maximum
number of stored objects limited by FOV ambiguity is $M
= 680$. The FOV ambiguity issue is moot for volume
holograms because Bragg selectivity prevents reconstruction
of beams angularly shifted in the same plane as the
original reference and object beams. (The selectivity is much
less, however, for out-of-plane shifts [24].)
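
The arithmetic behind the quoted figure can be checked with a few lines of Python. This is only a sketch of (7) and (8): the function name is illustrative, and it assumes the small-angle form of (7) with $\Phi$ converted to radians.

```python
import math

def fov_limited_capacity(focal_length_mm, shift_mm, angular_range_deg):
    """Number of objects allowed by FOV ambiguity, following (7) and (8)."""
    theta = shift_mm / focal_length_mm        # (7): angular spectrum range, radians
    phi = math.radians(angular_range_deg)     # hologram angular range, radians
    return (phi / theta) ** 2                 # (8): two-dimensional case

m = fov_limited_capacity(500.0, 10.0, 30.0)
print(f"M is about {m:.0f}")
```

With the parameter values quoted above the result is in the mid-680s, which agrees with the rounded value of 680 given in the text.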

Assuming  an  aperture  which  eliminates  the  ambiguous 
reconstruction,  only  terms  for  which  m  =  m'  are  retained 
in  (6): 

$$a^n(i) = F\Big\{ \sum_m \sum_p f\big(C_m^{n-1}(p)\big)\, a^m(i - p) \Big\}. \tag{9}$$
The SNR for NHAM in the first iteration or at the end
of the first round-trip [before pointwise nonlinear
transformation in the object domain by $F(\,)$] can be calculated
from

$$\mathrm{SNR} = \frac{\big| f[C_{m_0}^0(0)] \big|}{\Big\{ \sum_{p \neq 0} \big| f[C_{m_0}^0(p)] \big|^2 + \sum_{m \neq m_0} \sum_p \big| f[C_m^0(p)] \big|^2 \Big\}^{1/2}}. \tag{10}$$

I will now make some assumptions concerning the
statistical properties of the stored objects in order to calculate
the $C_m(p)$ cross correlation coefficients. In particular, I
will assume the objects are random and not orthogonalized
so that the statistics can be described by a balanced
binary phase diffuser model [25]:

$$C_m^0(p) = \begin{cases} N\delta(p) + \sqrt{2(N - |p|)/3}, & \text{if } m = m_0 \\ \sqrt{2(N - |p|)/3}, & \text{if } m \neq m_0 \end{cases} \tag{11}$$

where $N$ is the size of the stored object vectors. Assume
further that $f(x)$, the point nonlinearity in the reference
or correlation domain, has the form $f(x) = x^n$. (The
effects of arbitrary nonlinearities can then be estimated by
using a polynomial approximation and superposition.)
Substituting these expressions in (10) and performing the
summations over $m$ and $p$ results in the following
expression for the SNR:

$$\mathrm{SNR}_{\mathrm{NHAM}} = \beta\,(3/2)^{n/2} \sqrt{(n + 1)/2}\; \frac{N^{(n-1)/2}}{\sqrt{M}}, \qquad N \gg 1 \tag{12}$$

where $\beta$ is the fraction of $a^{m_0}$ used as the input object and
$M$ is the number of stored vectors. A heuristic estimate
for the storage capacity can be obtained by solving (12)
for $M$ in terms of $N$. Assuming a minimum SNR required
for successful associative recall, $M$ should be proportional
to $N^{n-1}$. The proportionality constant is given by the
minimum SNR required by the particular NHAM system for
successful convergence. Therefore, within limits set by
the available dynamic range, we can conclude that the
storage capacity of an NHAM can be increased by
increasing the nonlinearity in the correlation domain. A
similar analysis for the outer-product associative memory
results in

$$\mathrm{SNR}_{\text{outer product}} = |2\beta - 1| \sqrt{N/M} \tag{13}$$

which, using the above SNR arguments, implies that $M$,
the number of stored objects, should be linearly
proportional to $N$. The storage capacity of an outer-product
model was reported by Hopfield as linear in $N$ based on
empirical evidence for small $N$ values [12]. Using a
hyperplane counting argument, Abu-Mostafa and St. Jacques
have shown that the capacity of the outer-product model
is bounded from above by $N$ [26]. McEliece et al. [27],
Bruce et al. [28], and Weisbuch and Fogelman [29]
applied techniques from coding theory to the outer-product
model and showed that for random objects the maximum
asymptotic value of $M$ for which all objects can be
recovered exactly is $N/(4 \log N)$ as $N$ approaches infinity.
Their results also implied that if a specified nonzero error
rate in recall can be tolerated, the asymptotic storage
capacity becomes linear in $N$. Gardner [30] extended these
results to a higher order generalization of the outer-product
model. Owechko et al. [31] performed computer
simulations of the storage capacity of the outer-product and
NHAM models. The results are shown in Fig. 3 for the
number of vectors stored as a function of $N$ for power law
nonlinearities with $n = 2$, 3, and 4. Each curve was
averaged over many runs using randomly selected vectors.
Because the input vectors did not contain any errors, the
simulations in effect tested the stored vectors for being
eigenvectors of the system.

Fig. 3. Comparison of storage capacity of reference-based NHAM with
the outer-product model for nonlinearities in correlation domain of form
$f(x) = x^n$. Error-free input objects assumed. (After [31].)
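
As a rough consistency check on the SNR argument, the sums in (10) can be evaluated numerically under the diffuser model (11) and compared with the closed form (12). The sketch below is illustrative rather than a reproduction of the simulations in [31]. It assumes an error-free input ($\beta = 1$), treats (10) as an amplitude ratio, and takes the signal as the delta-function peak $f(N)$ alone; all function names are hypothetical.

```python
import math

def snr_numeric(N, M, n):
    """Evaluate (10) under model (11) with f(x) = x**n and beta = 1."""
    signal = float(N) ** n                    # assumption: peak term f(N) only
    # |f(C)|^2 summed over all lags p for one stored object
    per_object = sum((2.0 * (N - abs(p)) / 3.0) ** n for p in range(-(N - 1), N))
    zero_lag = (2.0 * N / 3.0) ** n           # p = 0 term, excluded when m = m0
    noise_power = M * per_object - zero_lag
    return signal / math.sqrt(noise_power)

def snr_closed_form(N, M, n):
    """Closed form (12) with beta = 1, valid for N >> 1."""
    return 1.5 ** (n / 2) * math.sqrt((n + 1) / 2) * N ** ((n - 1) / 2) / math.sqrt(M)

def capacity(N, n, snr_min):
    """Solve (12) for M at a required minimum SNR (beta = 1)."""
    return 1.5 ** n * (n + 1) / 2 * N ** (n - 1) / snr_min ** 2

for n in (2, 3, 4):
    print(n, round(snr_numeric(1000, 50, n), 3), round(snr_closed_form(1000, 50, n), 3))
```

For large $N$ the two values agree closely, and `capacity` reproduces the $N^{n-1}$ scaling: doubling $N$ at $n = 3$ quadruples the estimated capacity.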

Combining (12) and (13) and solving for the nonlinearity
$n = n_{op}$ which results in an NHAM storage capacity
equal to the outer-product model gives $n_{op}$ approximately
equal to 2. This is verified in Fig. 3 as the slopes of the
$M$ versus $N$ curves plotted on logarithmic scales equal
$n - 1$ and the capacity for $n = 2$ is approximately equal
to the outer-product model. Although the above SNR
arguments and simulation results demonstrate the improvement
in storage capacity caused by nonlinearities in the
correlation domain, the heuristic nature of the arguments
is evident in light of the asymptotic results of McEliece
et al. for the outer-product model capacity.

The  improvement  in  storage  capacity  of  an  NHAM  over 
an  outer-product  associative  memory  is  due  to  its  close 
analogy  to  certain  higher  order  discriminant  models.  One 
form  of  the  nth  order  discriminant  model  can  be  defined 
as a generalization of the outer-product associative memory
model in which the $W_{ij}$ weight matrix is a tensor of
order $n + 1$:
where the $x^m$ are one-dimensional stored vectors and the
output is calculated by forming a tensor product:

$$x_i^{\mathrm{output}} = F\Big[ \sum_j \sum_k \cdots \sum_r W_{ijk \cdots r}\, x_j^{\mathrm{input}} x_k^{\mathrm{input}} \cdots x_r^{\mathrm{input}} \Big] = F\Big[ (x^{m_0} \cdot x^{\mathrm{input}})^n\, x_i^{m_0} + \sum_{m \neq m_0} (x^m \cdot x^{\mathrm{input}})^n\, x_i^m \Big]. \tag{15}$$


The  tensor  generalization  greatly  increases  the  number 
of  degrees  of  freedom  which  results  in  dramatically  in¬ 
creased  storage  capacities  [32]-[34].  Comparing  (15)  and 
(9)  shows  that  a  power  law  nonlinearity  of  degree  n  in  the 
correlation  plane  of  an  NHAM  is  analogous  to  an  nth  or¬ 
der  discriminant  function.  A  polynomial  nonlinearity  in 
the  correlation  plane  is  analogous  to  a  weighted  sum  of 
higher  order  discriminant  functions.  They  are  not  com¬ 
pletely  equivalent  because  inner  products  are  used  in  the 
outer  product  model  as  opposed  to  correlation  in  the 
NHAM.  This  results  in  additional  noise  terms  in  the 
NHAM  arising  from  the  wings  of  the  correlation  function. 

Other  sources  of  noise  will  also  be  present  in  practical 
NHAM  systems.  These  noise  sources  include  dielectric 
inhomogeneities  in  the  holographic  medium  and  detector 
noise [35]. For photorefractive materials, erasure of
previously recorded holograms during recording and
subsequent readout may also limit the storage capacity [36],
although fixing techniques [37] may remove the latter
limitation. These factors will reduce the storage capacity from
the  theoretical  diffraction-limited  estimates  of  Van  Heer- 
den.  Accurate  estimates  will  be  specific  to  the  particular 
system  being  considered. 


III.  Implementations 
A.  NHAM  Categories 

NHAM  implementations  can  be  categorized  based  on 
the  resonator  geometry  and  the  method  used  for  generat¬ 
ing  the  reference  beams  used  in  recording  the  holograms. 
They  can  be  further  differentiated  by  the  form  and  imple¬ 
mentation  of  the  nonlinearities.  Most  of  the  systems  re¬ 
ported  to  date  have  been  based  on  a  double  PCM  reso¬ 
nator  configuration  similar  to  the  one  described  above  in 
which  a  separate  independent  reference  beam  is  associ¬ 
ated  with  each  object  beam.  The  reference  beams  are  gen¬ 
erally  plane  waves,  so  that  the  reconstruction  quality  can 
be  controlled  by  adjusting  the  nonlinearities  in  the  corre¬ 
lation  domain  without  loss  of  gray  scale  fidelity  in  the 
object. (In general, most NHAM implementations to date
have not relied on the nonlinearity of the PCM's; instead,
various external nonlinear mechanisms have been used.)
Ring resonator geometries have also been proposed and
demonstrated which derive the reference beam from the
object beam. Although such systems lack some of the
discrimination obtainable using separate reference beams,
they do incorporate competition between stored modes
using nonlinear gain saturation. Some specific
implementations of these categories of NHAM, which vary mostly
in the nature of the feedback and thresholding
mechanisms, will now be described.


B.  Multipass  NHAM  Configurations 

1)  Phase  Conjugate  Mirrors:  Soffer  et  al.  [38],  [39] 
have  demonstrated  NHAM's  which  use  thin  thermoplastic 
Fourier  transform  holograms  as  the  storage  medium.  Ad¬ 
vantages  of  this  approach  include  shift-invariance  and  the 
capability  of  programming  heteroassociations  by  manip¬ 
ulating  the  correlation  plane.  A  disadvantage  is  the  lack 
of  Bragg  selectivity  which  results  in  low  information  stor¬ 
age  capacity  compared  to  volume  holograms. 

This NHAM structure is identical to Fig. 1 and the
theory of the previous section is applicable without
modification. In experimental demonstrations a single iteration
nonresonating configuration was used, as shown in Fig.
4. Two objects were recorded sequentially in the
hologram, each with its respective angularly-shifted plane
wave reference beam. The hologram was recorded at
514.5 nm using a Newport Corporation thermoplastic
holographic camera. A partial version of one of the stored
objects was then used to address the hologram. A lens was
used to focus the correlation plane output of the hologram
into a PCM based on degenerate four-wave mixing
(DFWM) in BaTiO3. Typical parameters for PCM
operation were wavelength 514.5 nm; forward and backward
pump fluxes 3.3 and 11.5 W/cm$^2$, respectively; internal
pump-probe angle 26°; and internal angle of grating k
vector to c axis 13°. The output of the hologram acted as
a  probe  for  the  DFWM  system,  generating  an  amplified 
phase  conjugate  of  the  correlation  plane.  The  conjugated 
backward  propagating  beam  illuminated  the  hologram, 
recreating  a  complete  version  of  the  stored  object.  Ex¬ 
amples  of  stored  objects,  partial  inputs,  and  reconstructed 
outputs  are  shown  in  Figs.  5  and  6.  The  capability  of  a 
reference  based  NHAM  to  handle  gray  scale  objects  is 
demonstrated  in  one  of  the  examples.  In  this  series  of  ex¬ 
periments  a  single  pass  nonresonator  configuration  was 
used  and  the  PCM  was  operated  in  the  linear  reflectivity 
regime  of  DFWM.  Thresholding,  whether  due  to  com¬ 
petition  between  resonator  modes  caused  by  gain  satura¬ 
tion  or  to  nonlinearities  in  the  PCM  reflectivity,  was  not 
demonstrated  in  this  system.  The  quality  of  the  recon¬ 
structions  using  what  was  basically  a  linear  associative 
memory  was  due  to  the  coding  of  the  objects  using  high 
spatial  frequency  diffusers  in  contact  with  the  objects.  The 
sharpened  autocorrelation  peaks  of  the  diffusers  improved 
the  resolution  of  the  objects. 

Fig. 4. Configuration used in associative memory experiments by Soffer
et al. (After [39].)

Fig. 5. Reconstruction of gray scale image from partial input: (a) image
stored in hologram; (b) partial input image; (c) associated output image
(inversion due to mirror reflection). (After [39].)

Fig. 6. Reconstruction of complete objects from partial input objects for
multiple stored objects: (a) images stored in hologram; (b) superimposed
images shown as recorded; (c) partial input images; (d) corresponding
output images. (After [39].)

Fig. 7. Block diagram of hybrid NHAM. (After [40].)

Fig. 8. Schematic diagram of hybrid NHAM. (After [40].)

2) Electronic Lookup Tables: In order to address the
issues of implementing controllable arbitrary nonlinearities
in the correlation plane and making diffusers unnecessary,
increasing the optical gain in order to achieve resonator
oscillation, and facilitating the interfacing of an
NHAM to an electronic host computer, Owechko [40]
suggested and implemented a hybrid optical-electronic
NHAM based on liquid crystal light valves (LCLV's). A
block diagram of the hybrid NHAM is shown in Fig. 7
and a detailed schematic in Fig. 8. The basic principles
of the hybrid NHAM are the same as described for the
all-optical NHAM. The implementation of the input and
feedback mechanisms is, however, quite different.
Instead of using DFWM in BaTiO3 to create true phase
conjugates of the reconstructed reference and object beams,
a pseudoconjugation system using video detectors and
CRT-addressed LCLV's was used. Referring to Fig. 8,
the partial input image is focused onto an object loop video
detector, which transfers the image to a CRT-addressed
LCLV. The optical output of the object loop LCLV
addresses the thermoplastic hologram and reconstructs the
correlation plane, which is focused on the reference loop
video detector. A one-to-one mapping is performed
between the detector and the output of the reference loop
LCLV, which is positioned in the back focal plane of the
correlation lens. Thresholded correlation peaks on the
reference loop LCLV are converted into backward propagating
plane wave reference beams by the correlation lens. These
beams address the hologram, reconstructing recorded
objects which are in turn focused on the object loop video
detector, closing the resonator loop. The combined gain
of the detector/CRT/LCLV loops is more than sufficient
to overcome the optical losses, resulting in a feedback
system. The advantage of this approach is that general
nonlinear feedback functions can be easily programmed.
Between the reference loop video detector and the LCLV,
the correlation plane is nonlinearly processed in
electronic form using digital lookup tables in a PC board level
image processor. The image processor can also be used
to program heteroassociations or multilayer optical neural
networks by shuffling subregions of the correlation plane
[32].
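
The role of the digital lookup table can be illustrated with a short sketch. The code below is not the image processor of [40]; it assumes an 8-bit video signal and a power-law table of the form $f(x) = x^n$ rescaled to the same range, and the function names and sample values are made up for illustration.

```python
def power_law_lut(n, levels=256):
    """Table mapping each input gray level v to round(top * (v / top)**n)."""
    top = levels - 1
    return [round(top * (v / top) ** n) for v in range(levels)]

def apply_lut(frame, lut):
    """Apply the table pixel by pixel to a 2-D list of integer gray levels."""
    return [[lut[v] for v in row] for row in frame]

lut = power_law_lut(n=3)
correlation_plane = [[10, 40, 10],
                     [40, 250, 60],
                     [10, 60, 10]]      # hypothetical detector readout
processed = apply_lut(correlation_plane, lut)
print(processed)
```

Because the table is just an array indexed by gray level, any other nonlinearity (a hard threshold, a polynomial, a sigmoid) can be substituted by changing how the table is filled, which is the programmability advantage noted above.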

In preliminary experiments using a 20 mW HeNe laser
at 632.8 nm, a single object-reference pair was recorded
in the hologram. This demonstration showed that the
hybrid resonator has at least one stable state which can be
reached  only  if  the  injected  signal  is  sufficiently  similar 
to  the  stored  image.  As  shown  in  Fig.  9,  if  more  than  50 
percent  of  the  object  were  injected  into  the  system,  reso¬ 
nance  would  occur  and  the  system  would  latch  onto  the 
stored  object.  The  object  would  continue  to  circulate  in 
the  resonator  after  removal  of  the  input,  demonstrating 
bistability.  Interruption  of  the  circulating  beam  would  re¬ 
turn  the  resonator  to  its  initial  zero  state.  The  hybrid 
NHAM  demonstrated  robustness  in  the  face  of  input  dis¬ 
tortions.  For  example,  if  the  input  object  was  rotated  by 
up  to  10°,  the  output  would  still  switch  to  the  resonator 
state  consisting  of  a  circulating  undistorted  version  of  the 
stored  object.  The  amount  of  tolerable  distortion  in¬ 
creased  as  the  sharpness  of  the  nonlinearity  in  the  corre¬ 
lation  plane  was  increased.  The  system  would  not  latch 
for  different  input  objects,  indicating  the  resonator  was 
recognizing  the  input  object  and  not  merely  being 
switched  by  stray  scattered  light. 

3) Pinhole Array: Paek and Psaltis [41] have
demonstrated two different NHAM systems. In the first system,
a  single-pass  passive  system  shown  in  Fig.  10,  a  set  of 
spatially  multiplexed  objects  are  holographically  re¬ 
corded,  all  simultaneously  using  a  single  reference  beam. 
In other words, a single "macro object" is recorded in
the  hologram  which  consists  of  many  subregions,  each 
containing  a  single  object.  The  macro  object  and  the  ho¬ 
logram  are  located  in  the  front  and  back  focal  planes  of  a 
lens,  which  results  in  the  formation  of  a  Fourier  transform 
hologram.  During  readout  an  aperture  equal  in  size  and 
shape  to  the  subregions  in  the  macro  object  is  centered  in 
the  input  plane  and  input  objects  are  placed  inside  it,  as 
shown  in  Fig.  11.  (See  discussion  of  FOV  ambiguity  in 
the  previous  section.)  This  approach  is  equivalent  to  se¬ 
quentially  recording  objects  centered  in  the  same  aperture 
but  with  angularly-shifted  plane  wave  reference  beams. 
Both approaches divide the correlation plane into
subregions. During readout, the presence of a correlation peak
in a particular subregion uniquely labels which stored
object is being recognized. The location of the correlation
peak within the subregion has a one-to-one correspondence
to the location of the object in the input aperture.
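
The labeling scheme described above amounts to an integer division of the peak position by the subregion width. The following one-dimensional sketch assumes equal-width subregions and uses a hypothetical helper name and made-up data.

```python
def locate_peak(correlation, num_objects):
    """Return (object index, shift from subregion centre) of the largest peak."""
    width = len(correlation) // num_objects       # equal-width subregions assumed
    peak = max(range(len(correlation)), key=lambda i: correlation[i])
    return peak // width, peak % width - width // 2

plane = [0.0] * 40      # four subregions of 10 samples each (illustrative)
plane[27] = 1.0         # peak in the third subregion, 2 samples right of centre
label = locate_peak(plane, 4)
print(label)
```

The first element of the result identifies the stored object and the second gives the displacement of the input object within the aperture.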

Fig. 9. Hybrid NHAM latched output for partial input. (After [40].)

Fig. 10. Recording of multiple objects in a thin Fourier transform
hologram using spatial multiplexing of the objects. (After [41].)

Fig. 11. Schematic diagram of the pinhole array-mirror holographic
associative memory system. (After [41].)

Thresholding was implemented using a pinhole array in
contact with a mirror placed in the back focal plane of a
correlation lens. The correlation lens, in turn, was
positioned so that its front focal plane coincided with the
Fourier transform hologram. The correlation lens and mirror
acted as a "cat's eye" pseudo-conjugator which
retroreflected the reconstructed reference beams back to the
hologram for readout of the hologram. The appropriate
stored object was reconstructed centered on the input
aperture. The pinhole array passed only the peaks of the
correlations, suppressing the sidelobe noise and improving
the reconstruction quality. Such an approach to
correlation plane nonlinearities has the advantage of simplicity,
but it also destroys the natural shift invariance of the
Fourier transform hologram. Shifts of the input object within
the input aperture shift the correlation peak as well. Since
the pinholes are spatially fixed, no object shifts can be
tolerated. (Paek and Psaltis have discussed approaches for
restoring shift invariance by eliminating the pinhole array
and using quadratic nonlinearities in the correlation plane
[e.g., $n = 2$ in (15)], but have not discussed specific
implementations.) Their experimental results using the
pinhole system are shown in Fig. 12. Four objects were
stored in the hologram. The reconstructed outputs and
their associated partial inputs are shown. The poor
reconstruction quality may have been due to the relatively large
size of the pinholes (350 microns). Because of the
passive nature of the pinhole-mirror pseudoconjugator and
the resultant lack of gain, a resonator architecture was not
implemented.
Fig. 12. Pinhole array-mirror associative memory: partial inputs (left) and
outputs (right). (After [41].)


In  order  to  improve  the  reconstruction  quality,  in  their 
second  system  Paek  and  Psaltis  separated  the  functions  of 
identification  and  reconstruction  and  used  a  separate  hol¬ 
ogram  for  each  function.  This  second  NHAM  is  shown  in 
Fig.  13.  A  thresholding  spatial  light  modulator  (micro- 
channel  spatial  light  modulator  or  MSLM)  was  also  added 
in  the  input  path.  Thresholding  the  input  image  [the  F(  ) 
function  in  (9)  ]  can  sharpen  the  correlation  peak  and  im¬ 
prove  the  reconstruction  quality.  The  first  Fourier  trans¬ 
form  lens,  hologram,  correlation  lens,  and  pinhole  array 
combination  is  identical  to  the  thresholded  Vander  Lugt 
correlator  portion  of  their  first  system.  However,  now  in¬ 
stead  of  retroreflecting  the  correlation  peak  back  toward 
the  first  hologram,  it  is  passed  on  to  a  second  correlation 
lens  which  converts  it  to  a  plane  wave  reference  beam 
which  reads  out  a  second  hologram.  The  second  hologram 
is  recorded  in  the  same  setup  as  the  first  using  the  same 
objects  and  reference  beam.  The  second  hologram  there¬ 
fore  reconstructs  the  associated  object  when  addressed  by 
the  thresholded  reference  beam.  During  recording  each 
hologram  is  optimized  for  its  particular  function.  The  rel¬ 
ative  intensities  of  the  reference  and  object  beams  were 
adjusted  during  recording  of  the  first  hologram  to  empha¬ 
size  high  spatial  frequencies  in  the  object.  This  tended  to 
orthogonalize  the  objects  and  increase  the  autocorrelation 
peak  relative  to  its  sidelobes  and  cross-correlations.  The 
second  hologram,  on  the  other  hand,  was  recorded  with 
diffuse  illumination  to  improve  the  display  quality  when 
it  is  addressed  by  a  restored  plane  wave  reference  beam. 
The  combination  of  object  thresholding,  orthogonaliza- 
tion,  and  display  optimization  (which  was  made  possible 
by  the  separation  of  recognition  and  reconstruction  func¬ 
tions  between  the  two  holograms)  greatly  improved  the 
reconstruction  quality,  as  shown  in  Fig.  14. 

Fig. 13. Pinhole array optical associative loop. (After [41].)

Fig. 14. Pinhole array optical associative loop: (a) four stored memories;
reconstructed images from (b) the first hologram and (c) the second
hologram; and (d) partial input and (e) recalled output. (After [41].)

Fig. 15. Experimental arrangement of associative memory using feedback
from optical fibers. (After [42].)

4) Optical Fibers and Mirrors: An alternative, but
closely-related approach to thresholding the correlation
plane is the use of optical fibers coupled to mirrors to
retroreflect the central peak of the correlation function
back to the hologram. This approach was demonstrated
by Yariv, Kwong, and Kyuma [42]. In their experiment,
shown in Fig. 15, two objects were recorded in a volume
holographic material using angularly-multiplexed
reference beams. Taking into account the volume nature of the
holographic medium and assuming that the induced index
variations are linearly proportional to the optical
exposure, the volume index variation can be written as

$$\Delta n \propto \sum_{i=1}^{N} \left( E_i^{*} E_{i0} + \mathrm{c.c.} \right) \tag{16}$$

where $E_i$ are the recorded objects, $E_{i0}$ are the angularly
multiplexed reference beams, and $N$ is the total number
of recorded objects. When the hologram is addressed by
a partial input object $E'$, the diffracted field is given by

$$E(\mathbf{r}) \propto \sum_i \int_V A'(\mathbf{r}')\, A_i^{*}(\mathbf{r}')\, A_{i0}(\mathbf{r}')\, e^{-j(\cdots)}\, dx'\, dy'\, dz' \tag{17}$$

where the integral is performed over the volume of the
hologram, $r = |\mathbf{r}|$, and $\mathbf{r}$ is a point in the observation
plane. The $A$'s are the slowly varying amplitudes of the
input object, stored objects, and reference beams. The
above quantity represents a sum of distorted versions of
the original plane wave reference beams. It is analogous
to (3), which was derived for a thin hologram. When these
beams are spatially filtered and retroreflected in the
correlation plane the result is a sum of plane-like waves
propagating back along the direction of $E_{i0}$ with complex field
amplitudes proportional to the overlap integral

$$J_i(\mathbf{r}) = \int A'(\mathbf{r}')\, A_i^{*}(\mathbf{r}')\, A_{i0}(\mathbf{r}')\, dx'\, dy'\, dz'. \tag{18}$$

The above overlap integral is analogous to the inner-product
formed in the thin hologram case when the correlation
function is sampled at its central peak. It is a measure of
the similarity of the input object to the stored objects. The
set of retroreflected plane wave references is given by

$$E_{\mathrm{refl}} \propto \sum_i E_{i0}^{*} J_i. \tag{19}$$

If a nonlinearity is used to enhance the strongest $J_j$ and
completely suppress the weaker ones, and if this $J_j$ is
allowed to illuminate the hologram, the reconstructed
output will be given by

$$E_{\mathrm{recons}}(\mathbf{r}) \propto J_j A_j^{*}(\mathbf{r}) \tag{20}$$

which is proportional to the conjugate of the stored object
$A_j(\mathbf{r})$. Therefore, with the proper nonlinearities in the
correlation domain, a volume hologram NHAM will
display the one stored object that has the largest spatial
overlap integral with the input object.
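
The winner-take-all behavior described by (18)-(20) can be sketched in discrete form. This is a toy model under stated assumptions, not the analysis of [42]: the fields are sampled at a handful of points, the reference amplitudes $A_{i0}$ are taken as unity, the nonlinearity is an ideal maximum selector, and the function names and sample vectors are hypothetical.

```python
def overlap(a, b):
    """Discrete stand-in for (18) with unit reference amplitude: sum of a * conj(b)."""
    return sum(x * y.conjugate() for x, y in zip(a, b))

def winner_take_all_recall(stored, probe):
    """Pick the stored field with the largest |J| and return its conjugate, as in (20)."""
    js = [overlap(probe, s) for s in stored]
    j = max(range(len(stored)), key=lambda i: abs(js[i]))
    return j, [js[j] * v.conjugate() for v in stored[j]]

stored = [[1 + 0j, 1j, -1 + 0j, -1j],
          [1 + 0j, 1 + 0j, 1j, 1j]]
probe = [1 + 0j, 1j, 0j, 0j]          # first half of stored[0], rest blocked

index, field = winner_take_all_recall(stored, probe)
print("selected object:", index)
```

An ideal selector is the limiting case; a softer nonlinearity would let a weighted mixture of stored objects through, which corresponds to the incomplete discrimination reported in some of the experiments discussed in this section.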

In the system shown in Fig. 15, Yariv, Kwong, and
Kyuma used optical fibers to sample the peak in the
correlation plane and generate the $J_i$. The opposite ends of
the  fibers  were  butted  against  mirrors  which  retroreflected 
the  light  back  to  the  hologram.  Since  the  fiber  ends  were 
located  in  the  back  focal  planes  of  correlation  lenses,  re¬ 
constructed  plane  wave  reference  beams  illuminated  the 
hologram.  (This  spatial  filtering  technique  is  conceptually 
identical  to  the  pinhole-mirror  technique  used  by  Paek 
and Psaltis.) Experimental results for storing two
overlapping, nonorthogonal objects using this system are
illustrated in Fig. 16. In a modification of the system, the
mirrors were replaced with a conjugating-thresholding
element. This element consisted of a bistable oscillator
[43] using a ring resonator passive phase conjugate mirror
[44]. The bistable oscillator utilizes mode competition to
selectively  enhance  the  strongest  mode  at  the  expense  of 
weaker  ones  and  retroreflect  it  back  to  the  hologram.  The 
bistable  oscillator  was  added  to  the  NHAM,  as  shown  in 
Fig.  17,  to  further  enhance  the  discrimination  between 
reconstructed  reference  beams.  Experimental  results  using 
this  thresholding  system  are  also  shown  in  Fig.  16. 

5) Pinhole Array and PCM: White, Aldridge, and
Lindsay [45] have constructed an NHAM which utilizes
a pinhole array and PCM combination for thresholding
the correlation plane. Their system is illustrated in Fig.
18. The correlation plane is sampled using a fixed pinhole
array in a manner similar to that of Paek and Psaltis, but
the restored reference beam is retroreflected back along
its path to the hologram using a PCM rather than an
ordinary mirror. The PCM consists of DFWM in BaTiO3,
which results in the system having net optical gain. The
storage medium consisted of Fourier transform holograms
in dichromated gelatin. In their experiments, two objects
were sequentially recorded using angularly-multiplexed
plane wave reference beams. During recording the
reference-object beam ratio was adjusted to enhance the high
spatial frequencies of the object, resulting in edge
enhancement. This edge enhancement sharpened the
autocorrelation peaks and improved object discrimination.

Fig. 16. Associative memory using feedback from optical fibers: (a) stored
image $E_1$; (b) image $E_1$ diffracted off the hologram by a plane wave input
at plane P; (c) partial input image $E_1'$; (d) retrieval of the stored image
$E_1$ by the partial image $E_1'$ using the system of Fig. 15; (e) retrieval of
the stored image $E_1$ by the partial image $E_1'$ using the system of Fig. 17;
(f) stored image $E_2$; (g) image $E_2$ diffracted off the hologram by a plane
wave input at plane P; (h) partial input image $E_2'$; (i) retrieval of the
stored image $E_2$ by the partial image $E_2'$ using the system of Fig. 15; (j)
retrieval of the stored image $E_2$ by the partial image $E_2'$ using the system
of Fig. 17. (After [42].)

Their experimental results are shown in Fig. 19. Each
object consisted of four geometric shapes. The only
common element between the two objects was a circle in the
lower left quadrant. As shown in Fig. 19(a) and (b), if a
unique subset of an object addressed the NHAM, a
complete, albeit edge-enhanced, version of that object was
reconstructed. When the circle addressed the hologram [Fig.
19(c)], the correlations with the two memories were equal
and a superposition of the two stored objects was
reconstructed. In order to test the discrimination of the NHAM,
the symmetry was broken by including additional
subobjects to favor one of the memories (Fig. 20). This did tend
to enhance the memory with the larger correlation, but the
discrimination was not complete as a faint image of the
other memory can still be seen. The authors attributed this
to  a  lack  of  nonlinearity  in  the  PCM,  since  the  reflectivity 
of  DFWM  is  essentially  linear  for  low  probe  beam  inten- 


Fig.  17  Associative  memory  using  bistable  oscillator  based  on  passive  ring 
phase  conjugate  mirror.  (After  (42).) 
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Fig  18.  Associative  memory  using  pinhole  array  and  phase  conjugate  mir 
ror  based  on  four-wave  mixing  in  BaTiO,.  (After  |45|.) 


sities  (no  pump  depletion  regime).  No  object  was  recon¬ 
structed  when  the  input  consisted  of  a  geometric  shape  not 
present  in  either  of  the  memories  [Fig.  19(d)]. 

6) Nonlinear Etalons: A novel method of thresholding the
correlation plane in an NHAM was demonstrated by Wang et al. [46].
The NHAM system was similar to the single-pass systems described
above, with a dichromated gelatin holographic storage element,
except that the thresholding element was a ZnS bistable etalon. As
shown in Fig. 21, holding beams were used to bias the etalon just
below the threshold point where it would switch from
nontransmitting to transmitting. The etalon was positioned in the
back focal plane of the correlation lens. If the peak of the
autocorrelation function was sufficient to switch the etalon, the
holding beam at that point would be transmitted. Since the holding
beams were aligned to be counterpropagating with the reference
beams, the transmitted holding beam read out the hologram and
reconstructed the associated image. The need for a PCM was
therefore avoided. Both auto- and heteroassociation could be
implemented by directing the holding beams to the same or
different holograms. Associations of two stored fingerprint images
have been demonstrated. Wang et al. have discussed various
practical limitations of this approach, including the high power
requirements and nonuniformity of the ZnS etalon. Moreover, a PCM
or pseudoconjugator would have to be added on the object side of
the hologram in order to form a multipass resonator.

Fig. 19. Associative recall (right) for single test inputs (left)
using associative memory system of Fig. 18: (a) Maltese cross;
(b) diagonal cross; (c) circle; (d) hexagon (not in training set).
(After [45].)

C.  Ring  Resonator  NHAM  Configurations 

An alternative type of optical associative memory is the ring
resonator NHAM described and demonstrated by Anderson [47]. In the
ring resonator NHAM, the reference beam for recording the hologram
is derived from the object beam in a ring configuration, as shown
in Fig. 22. After the hologram is recorded, each stored pattern
defines an eigenmode of the resonator in the same manner as for
the linear resonator NHAM’s described previously. An association
is made by injecting a portion of the original pattern. A gain
medium inside the resonator amplifies the eigenmode with the
largest overlap with the injected field. The other eigenmodes are
suppressed by a gain competition mechanism.

Fig. 21. Associative memory apparatus using nonlinear ZnS etalons.
(After [46].)

Anderson and Saxena [48] have performed a perturbative analysis of
the evolution of the fields inside the ring resonator NHAM. In
their analysis the equation of motion for eigenmode $n$ is

$$\dot{I}_n = \alpha_n I_n - \theta_{nn} I_n^2 - \sum_{m \neq n} \theta_{nm} I_n I_m \qquad (21)$$

where $\alpha_n$ is a linear gain coefficient, $\theta_{nn}$ is a
self-saturation coefficient describing how much the presence of a
mode suppresses itself, and $\theta_{nm}$ is a cross-saturation
coefficient indicating to what degree one mode suppresses another.
The cross-saturation term is proportional to the mode intensity
overlap integral:


$$\theta_{nm} \propto \int_{\text{gain volume}} |U_n(\mathbf{r})|^2 \, |U_m(\mathbf{r})|^2 \, d^3r \qquad (22)$$

where $U_n(\mathbf{r})$ is the amplitude distribution of mode $n$.
For the case of two stored eigenmodes, the gain competition
between modes is described by the ratio of cross- to
self-saturation:

$$C = \frac{\theta_{12}\,\theta_{21}}{\theta_{11}\,\theta_{22}} \qquad (23)$$

If $C \ll 1$, then overlap between different modes is low,
competition is weak, and one mode does not influence the other. If
$C \gg 1$, then competition is strong and one mode will dominate
over the other. Anderson and Saxena’s theoretical results indicate
that for a gain medium based on photorefractive two-wave mixing in
barium titanate, $C$ can be at most 1. The ring resonator NHAM is
adjusted until $C$ is approximately 1 for all pairs of modes. The
competition between modes can then be biased with an injected
signal. Anderson and Erie [49] have demonstrated this concept
using both simple plane waves and printed characters as the
recorded eigenmodes. An example of an image stored in a ring
resonator NHAM is shown in Fig. 23.

Fig. 22. Holographic ring resonator memory: (a) recording of
hologram; (b) recall by injected signal. Gain is supplied by a
pumped photorefractive medium. (After [47].)
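The gain competition described by (21) and (23) can be explored
numerically. The following is a minimal sketch, not Anderson and
Saxena's analysis: it integrates (21) for two modes with a simple
forward Euler step. The coefficient values, the step size, and the
use of unequal starting intensities as a stand-in for an injected
signal are all assumptions made for illustration. The strongly
competing case uses C = 4 purely to show the winner-take-all limit,
even though the text notes that C is at most 1 for two-wave mixing
in barium titanate.

```python
def simulate_modes(alpha, theta, intensities, dt=0.01, steps=5000):
    """Forward-Euler integration of
    dI_n/dt = alpha_n I_n - theta_nn I_n^2 - sum_{m != n} theta_nm I_n I_m.
    Returns the mode intensities after `steps` time steps."""
    I = list(intensities)
    n_modes = len(I)
    for _ in range(steps):
        dI = []
        for n in range(n_modes):
            # Self-saturation (m == n) plus cross-saturation (m != n).
            saturation = sum(theta[n][m] * I[m] for m in range(n_modes))
            dI.append(I[n] * (alpha[n] - saturation))
        I = [max(0.0, I[n] + dt * dI[n]) for n in range(n_modes)]
    return I


def coupling_ratio(theta):
    """C = theta_12 * theta_21 / (theta_11 * theta_22) for two modes."""
    return (theta[0][1] * theta[1][0]) / (theta[0][0] * theta[1][1])


alpha = [1.0, 1.0]                   # equal linear gain (assumed values)
start = [0.02, 0.01]                 # mode 1 seeded slightly more strongly

strong = [[1.0, 2.0], [2.0, 1.0]]    # C = 4: strong competition
weak = [[1.0, 0.5], [0.5, 1.0]]      # C = 0.25: weak competition

strong_final = simulate_modes(alpha, strong, start)
weak_final = simulate_modes(alpha, weak, start)
```

With these assumed values the strongly coupled pair ends with the
initially favored mode saturated and the other extinguished, while
the weakly coupled pair settles to equal intensities regardless of
the initial bias.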

This approach is different from the previously described NHAM
architectures in that the reference beam is derived from the
object beam for recording the hologram. During readout no separate
thresholding is performed on the reconstructed reference beam.
Instead a nonlinear gain competition mechanism is relied on to
favor one reconstruction over other possible ones. This results in
a simpler design and automatic generation of reference beams for
recording, but at the cost of losing some of the flexibility and
storage capacity advantages of the plane wave reference based NHAM
described in Section II-B.

Fig. 23. Image storage and recall in the ring resonator of Fig. 22:
(a) output of resonator during writing; (b) output of resonator
during recall without an injected signal. (After [49].)


IV. Discussion

Nonlinear holographic associative memories represent an innovation
on older linear holographic memories. Nonlinearities and feedback
improve the reconstruction quality compared to ghost image
holography, but beyond that they make possible new optical
computation tools, such as image vector quantization and
programmable heteroassociations. These operations can be
implemented on large space-bandwidth-product images with the
parallelism characteristic of optics. Potential applications
include multiple target identification and optical computing using
symbolic substitution [50]. Modified versions of these
architectures may have applications in optical neural network
computers [51]. The parallelism and large interconnectivity make
NHAM’s especially attractive for this application.

It is interesting that most of the experimental systems reviewed
did not utilize the nonlinearity of photorefractive PCM’s to
improve storage capacity and perform error-correcting
associations. In most cases the PCM’s were used as linear phase
conjugating elements only and external supplemental nonlinearities
were added. The external nonlinearities included pinhole arrays,
optical fibers, bistable ring resonators, nonlinear etalons, and
electronic lookup tables. Even when a thresholding PCM based on a
bistable ring resonator was used, it was supplemented by spatial
filtering using optical fibers. The use of external nonlinearities
is due to experimental difficulties in controlling the nonlinear
reflectivity of a PCM. For example, a photorefractive PCM based on
four-wave mixing using external pumps can have a nonlinear
reflectivity when operated in the pump depletion regime [52].
However, since pump depletion is a nonlocal effect, the threshold
level of an incident beam is affected by other incident beams. (In
addition, the reflectivity of a self-pumped photorefractive PCM is
a function of the angle of incidence.) This interaction between
beams makes control of the optical nonlinearities difficult using
only a photorefractive PCM, making external nonlinearities a
practical necessity for consistent results. Pepper has discussed
an alternative method for thresholding and conjugating an optical
wavefront in an NHAM which uses a PCM for conjugation and a liquid
crystal light valve for controllable external thresholding [53].
The experimental systems discussed here demonstrate the potential
of NHAM’s but they are also evidence of the immature state of NHAM
implementations to date. Besides finding the optimum nonlinear
mechanism, issues remaining include demonstrating better image
quality, larger storage capacity, and programmability. Permanent
storage by fixing of holograms in photorefractive materials is an
important issue, although much work has already been done in this
area [54]-[56]. Interfaces to conventional electronic host
computers need to be developed for these systems to become
practical. In order to implement higher order tasks (such as
rotation and scale invariant recognition of patterns) NHAM modules
need to be incorporated in general purpose optical neural network
architectures. Nevertheless, these first generation systems have
demonstrated several design principles which will doubtless be
incorporated in future optical associative memory and neural
network processors.
Y. Owechko, B. H. Soffer, and G. J. Dunning

Hughes  Research  Laboratories 
Malibu,  CA  90265 


Abstract 


We describe an optoelectronic resonator associative memory system which utilizes
holographic interconnects. Image processing techniques are used to implement
nonlinearities and feedback. We show using numerical models that both power law and
sigmoidal nonlinearities improve the storage capacity. Our experimental results lead us to
be optimistic that this hybrid optical/electronic approach can be extended to adaptive
neural network models.


1.0  Introduction 


The self-organizing, adaptive features of neural network models developed by
biologists and mathematicians have in recent years piqued the interest of engineers who are
interested in applying them to problems in signal processing, pattern recognition, and
multi-variable optimization (1). Neural network models offer a data-driven unsupervised
computational approach which is complementary to the algorithm-driven approaches of
traditional information processing and artificial intelligence. The fine granularity,
massive interconnectivity, and high degree of parallelism set neural network models apart
from traditional electronic serial computing. These same features are the hallmarks of
optical computing architectures, which has led many workers to consider optical
implementations of neural network models (1-12).

As reported in (2-4), we have constructed and demonstrated a resonator-based nonlinear
holographic associative memory (NHAM) which can be described as an optical neural network.
A diagram of a generic NHAM is shown in Fig. 1. In this paper we describe a hybrid
optical/electronic version of the associative memory in which the nonlinearities are
implemented electronically. We also discuss some initial numerical results from computer
simulations which show the effects of various nonlinearities on NHAM performance.

The all-optical NHAM reported in (2) consisted of a hologram situated in a phase
conjugate resonator cavity formed by two phase conjugate mirrors (PCMs). The PCMs were
formed by four wave mixing in BaTiO3. An intra-cavity thermoplastic hologram defined the
self-consistent low-loss transverse modes of the resonator. These modes correspond to
images stored in the hologram. Several images were recorded as superimposed Fourier
transform holograms, each with a unique angularly shifted plane wave reference beam (which
corresponds to spatially separated delta functions in the input plane). If the hologram
was subsequently addressed by a partial or distorted version of one of the stored images, a
set of distorted reference beams was reconstructed. The oscillation threshold of the NHAM
and the nonlinear reflectivity of the phase conjugate mirrors act to enhance the strongest
reconstructed reference relative to the weaker ones. The stored image with the largest
correlation with the input survives at the expense of the less correlated images. A
method for adjusting the threshold level of a PCM was reported in (13). These nonlinear
mechanisms perform functions analogous to "winner-take-all" competitive neural networks.
The output of the associative memory after presentation with a distorted input is an
undistorted version of the input.

The storage capacity of such a nonlinear associative memory was shown in (3) to be
superior to a linear holographic associative memory when a power law nonlinearity is used
in the correlation domain. These results are reviewed in Section 2 and extended to
sigmoidal nonlinearities using numerical simulations. They indicate that the theoretical
storage capacity of an NHAM can be much greater than outer-product or simple correlation
matrix formulations of associative memory because of the superior cross-talk suppression
characteristics of the NHAM.

A hybrid optical/electronic version of the all-optical NHAM is described in Section 3.
In the hybrid NHAM the BaTiO3 based phase conjugate mirrors are replaced with video
detectors and spatial light modulators arranged in a pseudo-conjugating configuration.
Although the self-aligning feature of the all-optical phase conjugate resonator is lost
with this change, other desirable features are gained. Greater gain is possible due to the
combination of the electronic gain of the video detector and the optical gain of the
spatial light modulators (in this case Hughes Liquid Crystal Light Valves (LCLVs)). Large
gains are desirable since the diffraction efficiency of the hologram decreases as more
gratings are recorded. The limited hologram diffraction efficiency is an optical loss which
must be overcome in order to form a resonator. Using this hybrid approach, we have
demonstrated such an associative resonator. Another feature of the hybrid resonator
associative memory is that programmable digital video processing can be used to implement
nonlinearities and hetero-associative operations. The nonlinearities are point operations
and can be implemented at video rates using fast lookup tables. In this hybrid approach the
strengths of optics (linear transformations, massive interconnectivity, and parallelism)
and the strengths of electronics (point nonlinearities and programmability) are both used
to advantage.

The NHAM can be interpreted as a single layer optical neural network in which the
interconnection weights are established permanently and non-adaptively during recording of
the hologram. Feedback is used during readout but not in the recording of the weights. A
hybrid opto-electronic two-layer neural network is described in Section 4 in which the
weights can be adjusted adaptively. This system is a straightforward extension of the
hybrid NHAM which uses photorefractive crystals as the holographic storage medium.

2.0  NHAM  Storage  Capacity 

The storage capacity of the NHAM is limited by such factors as the resolution, area,
and dynamic range of the holographic storage medium and the overall system gain. A more
fundamental limitation, however, which is independent of such material issues, is
correlation noise. Correlation noise is especially bothersome for an NHAM which, in order
to maintain shift-invariance, is based on a thin hologram. The root cause is cross-talk
between non-orthogonal stored image vectors and it is similar to the storage limitation
mechanisms in the outer-product matrix type associative memories which have been described
by many workers. Fortunately, correlation noise can be greatly reduced in the reference-
based NHAM by utilizing the proper nonlinearities in the correlation domain. The effects
of such nonlinearities will be described in this section using examples from numerical
simulations.

A block diagram of an NHAM from which we will derive an iterative equation for the
NHAM resonator is shown in Fig. 2. The operators $H_{a \to b}$ and $H_{b \to a}$ represent
forward and backward paths through the hologram. The functions h() and f() represent point
nonlinearities in the image or object domain and in the reference or correlation domain,
respectively. Based on this diagram, an iterative equation can be written for the object
distribution $A_n(x)$ after the n-th iteration around the loop based on the previous
iteration distribution $A_{n-1}(x)$:

$$A_n = G[A_{n-1}]$$

where

$$G[A_n] = h\left\{ \sum_m f\left( A_n \star A^{(m)} \right) \ast A^{(m)} \right\}$$

(In this paper $\star$ and $\ast$ denote correlation and convolution, respectively.) We
derived the above equation by assuming a thin hologram and angularly shifted plane wave
reference beams, which correspond in the correlation domain to references which are
spatially shifted delta functions. The correlation domain nonlinearity f() operates on
each term separately because the terms are spatially separated due to the angular
multiplexing of the reference beams. The stored images $A^{(m)}$ are the eigenfunctions of
the operator G, i.e.,

$$G[A^{(m)}] = A^{(m)}$$

The correlation/convolution operations inherent in G serve to "recognize" inputs to the
system as members of the stored set of images. These operations are also the source of
capacity-limiting correlation noise when non-orthogonal images are stored. The nonlinear
functions f() and h() can be used to reduce the correlation noise.

In (3) we showed that when the correlation domain nonlinearity f() is of the form

$$f(x) = x^n$$

then the storage capacity in terms of the number, M, of images that can be stored with an
arbitrary "acceptable" level of cross-talk is proportional to

$$M \propto N^{n-1}$$

where N is the size of the stored image vectors. This result was derived assuming random
non-orthogonal binary image vectors and it was verified using computer simulations. The
above result indicates that the cross-talk among stored vectors can be reduced to an
arbitrarily small value by increasing the nonlinearity of the correlation domain function
f().
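The effect of a power law in the correlation domain can be illustrated with a small
numerical experiment. The sketch below is a simplified stand-in for the simulations
referred to above, under several assumptions: only the zero-shift correlation peak is
kept (so the shift-invariant correlation and convolution reduce to inner products and
weighted sums), the object-domain function h() is taken to be a hard sign threshold, the
exponent is restricted to odd integers so that x^n keeps its sign, and the sizes, seed,
and exponent are illustrative choices.

```python
import random


def correlate(a, b):
    """Zero-shift correlation (inner product) of two +/-1 vectors."""
    return sum(x * y for x, y in zip(a, b))


def recall_step(state, memories, power):
    """One pass of h{ sum_m f(state . A^(m)) A^(m) } with f(c) = (c/N)**power
    (power must be an odd integer) and h = sign."""
    n = len(state)
    weights = [(correlate(state, mem) / n) ** power for mem in memories]
    summed = [sum(w * mem[i] for w, mem in zip(weights, memories))
              for i in range(n)]
    return [1 if s >= 0 else -1 for s in summed]


def hamming(a, b):
    """Number of positions in which two vectors differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


rng = random.Random(0)
N, M = 50, 100                        # vector length and number of stored vectors
memories = [[rng.choice((-1, 1)) for _ in range(N)] for _ in range(M)]

target = memories[0]
probe = list(target)
for i in rng.sample(range(N), 9):     # reverse nine of the fifty bits
    probe[i] = -probe[i]

linear_errors = hamming(recall_step(probe, memories, 1), target)
power_errors = hamming(recall_step(probe, memories, 9), target)
```

Comparing `linear_errors` with `power_errors` shows how raising the correlation peaks to
a high power suppresses the contribution of the weakly correlated memories, which is the
cross-talk mechanism discussed above.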


However, in physically realizable systems the degree to which this can be achieved is
limited by the finite dynamic range of analog systems. Therefore, we have performed
computer simulations in which the f() and h() functions are sigmoidal (incorporating
saturation) and noise is added to the updated image vector after each iteration or round
trip through the NHAM resonator (simulating limited NHAM dynamic range).

The dynamic behavior of the NHAM simulation is illustrated in Figs. 3a-c using phase
diagrams. In the phase diagrams the horizontal coordinate represents the "distance" of the
current state of the system from the target image vector. The vertical coordinate
represents the distance in the next iteration. Distance D(k) in the k-th iteration is
defined here as $1 - \cos\theta$, where $\cos\theta$ is the direction cosine between the
state of the system and the target image vector. We use the direction cosine as a distance
measure rather than Hamming distance because it is a normalized quantity which measures
the orientation of image vectors in state space and is independent of the vector norm. It
is a better measure of image similarity. The dynamic evolution of the system for a
particular initial input is represented by a series of points which head toward the origin
when the system successfully converges to the target image vector. An "equilibrium line"
which passes through the origin with unity slope represents the projection of equilibrium
points (possibly unstable) in state space onto the phase diagram. If the output of the NHAM
evolves to an eigenfunction of the operator G, the distance D(k) from the target vector
will be constant for succeeding iterations k, hence the system will be "stuck" on the
equilibrium line. Recall of false or incorrect memory states is represented by system
trajectories which come to rest on the equilibrium line anywhere other than the origin.
Trajectories which monotonically approach the target vector are confined below the
equilibrium line.
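The distance measure and the construction of a phase diagram can be written down
compactly. The following sketch is illustrative only; the function names are ours, and
the two-state trajectory in the usage lines is a made-up example rather than simulation
output.

```python
import math


def direction_cosine_distance(state, target):
    """D = 1 - cos(angle between state and target); independent of vector norm."""
    dot = sum(s * t for s, t in zip(state, target))
    norm_s = math.sqrt(sum(s * s for s in state))
    norm_t = math.sqrt(sum(t * t for t in target))
    return 1.0 - dot / (norm_s * norm_t)


def phase_points(trajectory, target):
    """Pairs (D(k), D(k+1)) for successive states, to be plotted against the
    unity-slope equilibrium line."""
    d = [direction_cosine_distance(s, target) for s in trajectory]
    return list(zip(d[:-1], d[1:]))


target = [1] * 50
probe = [-1] * 9 + [1] * 41           # nine of fifty entries reversed
points = phase_points([probe, target], target)
```

For a 50-element vector of +/-1 entries with nine entries reversed, the direction cosine
is (50 - 18)/50 = 0.64, so the trajectory starts at D = 0.36. A point with D(k+1) less
than D(k) lies below the equilibrium line.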

In all of the following examples the image vectors are 50 bit long binary vectors
whose entries are ±1. The sigmoidal nonlinearity in the correlation plane in all cases
was

$$f(x) = \frac{10}{1 + \exp[0.23(x - 44)]}$$

which set the correlation threshold level at 44 (the maximum possible correlation peak
value was 50). In the phase diagram shown in Fig. 3a the following nonlinearity was used
in the object domain:

$$h(x) = \frac{1}{1 + \exp[1.5x]}$$

In all of the following examples the parameters for h() and f() were determined empirically
using numerical "experiments." No optimizations were done. A total of 100 random 50 bit
long image vectors was generated and stored in the NHAM operator G. The input vector was
generated by reversing nine of the bits in a randomly chosen stored vector. As evidenced
by the eventual path of the system toward the origin, the target vector was successfully
associated with the distorted input vector. In this case M = 2N, where M is the number of
stored vectors and N is the vector size. Even with this number of stored vectors an error
in the input of 18% (nine bits in error) was successfully corrected. This capacity and
error-correction ability is far in excess of outer-product matrix-based associative
memories where M ≈ 0.15N (14,15). Note that the system spent several iterations close to
the equilibrium line where progress toward convergence on the target vector is slow. The
trajectory can be pushed away from the equilibrium line and faster convergence obtained by
sharpening the object domain nonlinearity. In Fig. 3b the only change was an h() with a
slightly sharper threshold:

$$h(x) = \frac{1}{1 + \exp[1.9x]}$$


Note the faster convergence. Finally, in Fig. 3c noise was added after each iteration.
The magnitude of the noise was 10% of image vector magnitude. In this case the system
initially started to diverge until a random perturbation pushed the system into the basin
of attraction of the target vector.

3.0  Hybrid  NHAM 

A block diagram of the hybrid associative memory is shown in Fig. 4 and a detailed
schematic in Fig. 5. As in the all-optical NHAM, thin Fourier transform holograms were
recorded in thermoplastic film. Angular multiplexing of the reference beams acts to
separate correlation noise from the desired signal, improving the efficacy of thresholding
to remove the correlation noise and increase the signal to noise ratio of the reconstructed
image. The number of interconnection weights that can be stored in a thin hologram is much
less than in a thick hologram. However, because of the shift-invariance of the Fourier
transform, the relatively small number of interconnections is used very efficiently to
implement position independent pattern recognition. In this case the mapping for shift-
invariance is built into the system by the physics of lenses and diffraction. In a true
neural network with adaptable weights the system would have to "learn" the required
mappings from examples supplied by its environment or an external teacher.

Thresholding, feedback, and gain are provided electronically by two sets of vidicon
detectors, cathode ray tubes (CRTs), and LCLVs. A partial or distorted input image is
focused onto an object loop vidicon detector which transfers the image to an LCLV via a
CRT. The dashed lines in Fig. 5 indicate conjugate planes which are in one-to-one
correspondence with unity magnification. The output from the object loop LCLV addresses
the hologram and reconstructs distorted versions of the angularly multiplexed plane wave
reference beams used in recording the stored images. Each of the original delta function
references is convolved with the correlation of its respective associated object with the
input object. The distorted references are, therefore, simply the correlation functions of
the input object with the stored objects, each of which comes to a focus on a unique
subregion of the correlation plane. The locations of these subregions in the correlation
plane are determined by the angular shifts of the reference beams used during recording of
the hologram. The correlation functions are focused onto a reference leg vidicon detector.
A one-to-one mapping is performed between points on the detector and points on the output
of the LCLV. A pseudo-conjugate of the incident reference beam is generated by aligning
the LCLV in the back focal plane of the correlation lens. The activated pixels on the
LCLV which represent the thresholded correlation function are illuminated by a readout
beam. The activated spot on the LCLV is converted into a back-propagating undistorted
reference beam by the correlation lens. This restored reference beam addresses the
hologram and reconstructs its associated stored object as a real image which focuses on the
vidicon detector in the object loop of the resonator. Again, a one-to-one mapping is made
of the light incident on the vidicon to the readout side of the object LCLV. The restored
object image is then directed to the hologram, closing the resonator loop. The combined
gain of the vidicon/CRT/LCLV units is adjusted to overcome the optical losses of the
system. General nonlinear feedback functions can be easily implemented. The correlation
functions are processed in electronic form using an image processor with a programmable
digital look-up table before being sent to the reference leg LCLV.
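Because the correlation-plane nonlinearity is a point operation, it can be tabulated once
and then applied to every pixel of a digitized video frame by indexing. The sketch below
illustrates the idea only; it does not describe the image processor used in the
experiment. The 8-bit quantization, the rising sigmoid, the threshold of 224 (roughly
44/50 of full scale), the slope, and the toy frame are all assumed values.

```python
import math


def build_sigmoid_lut(threshold=224, slope=0.05, levels=256):
    """Tabulate a rising sigmoid once for every possible pixel value."""
    top = levels - 1
    return [int(top / (1.0 + math.exp(-slope * (v - threshold))) + 0.5)
            for v in range(levels)]


def apply_lut(frame, lut):
    """Point operation: replace each pixel by its table entry."""
    return [[lut[pixel] for pixel in row] for row in frame]


lut = build_sigmoid_lut()
frame = [[0, 100, 224], [240, 255, 30]]   # toy 2x3 "correlation plane"
thresholded = apply_lut(frame, lut)
```

Changing the nonlinearity then amounts to rewriting the 256 table entries, which is what
makes the feedback function programmable without altering the optical path.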

In our initial experiment, a single object/reference pair was recorded in the
hologram. Although this was obviously insufficient to demonstrate discrimination between
objects, it does serve to demonstrate that the resonator has a stable state which can be
reached only if the injected signal is sufficiently similar to the stored image. The
results are shown in Fig. 6. The input object was a partial version of the stored object,
an Air Force resolution chart. If more than approximately 50% of the object was injected
into the system, resonance was achieved and the system would latch onto the stored image.
The system would stay latched after removal of the input, demonstrating bistability.
Interrupting the circulating beam in the resonator would return it to its initial zero
state. If the input object was rotated by up to 10°, the output would still switch to its
other stable state. The output would be an undistorted (unrotated) version of the stored
object. The system would not recognize the object if it was rotated more than 10°,
indicating that the system was not merely being brought above threshold by noise.
4.0  Extension of NHAM to Adaptive Neural Networks


The opto-electronic resonator associative memory can be extended to implement an
adaptable and reconfigurable multi-layer optical neural network (ONN) with large storage
capacity and parallel weight update capability. A block diagram of the system is shown in
Fig. 7. A two-dimensional neural activation pattern (object OA) addresses subhologram H1
and reconstructs another activation pattern (reference R). Reference R is nonlinearly
processed and then shifted so that it addresses a second subhologram H2 and reconstructs a
third pattern (object OB). The two subholograms H1 and H2 are physically adjacent on the
same substrate and form the link weights between the input/output layers, OA and OB, and
the hidden layer, R. The hologram substrate is a volume photorefractive crystal such as
LiNbO3 in which link weights can be continuously reinforced or inhibited. The optical
pathways or links are bidirectional so that light can propagate not only in the direction
OA-R-OB but also OB-R-OA. An error signal is back-propagated through the ONN with its
phase shifted by 0 or π so that gratings which contribute to a large error signal can be
enhanced or inhibited. Thus this system can implement an optical version of the
back-propagation algorithm. The three activation patterns and two subholograms form a
two-layer optical neural network. OA is the input activation pattern, R is a "hidden" layer,
and OB is the output layer.
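
As a rough numerical analogue of the OA-R-OB network, the sketch below runs one back-propagation step on a tiny two-layer network. The weight lists `w1` and `w2` stand in for subholograms H1 and H2. The sigmoid activation, the learning rate, and all numerical values are assumptions for illustration; the sign of each weight change plays the role of the 0 or π phase shift (reinforce or inhibit).

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(w1, w2, oa):
    """Propagate OA -> R -> OB through the two weight sets (stand-ins for H1 and H2)."""
    r = [sigmoid(sum(w * x for w, x in zip(row, oa))) for row in w1]
    ob = [sigmoid(sum(w * h for w, h in zip(row, r))) for row in w2]
    return r, ob

def backprop_step(w1, w2, oa, target, rate=0.5):
    """One gradient step: a positive update reinforces a link, a negative one inhibits it."""
    r, ob = forward(w1, w2, oa)
    delta_ob = [(t - o) * o * (1.0 - o) for t, o in zip(target, ob)]
    # Error propagated backward through the same (bidirectional) weights w2.
    delta_r = [h * (1.0 - h) * sum(w2[k][j] * delta_ob[k] for k in range(len(w2)))
               for j, h in enumerate(r)]
    new_w2 = [[w2[k][j] + rate * delta_ob[k] * r[j] for j in range(len(r))]
              for k in range(len(w2))]
    new_w1 = [[w1[j][i] + rate * delta_r[j] * oa[i] for i in range(len(oa))]
              for j in range(len(w1))]
    return new_w1, new_w2

def squared_error(w1, w2, oa, target):
    _, ob = forward(w1, w2, oa)
    return sum((t - o) ** 2 for t, o in zip(target, ob))

# Illustrative values only.
w1 = [[0.5, -0.5], [0.3, 0.8]]
w2 = [[0.4, -0.6]]
oa, target = [1.0, 0.0], [1.0]
error_before = squared_error(w1, w2, oa, target)
w1, w2 = backprop_step(w1, w2, oa, target)
error_after = squared_error(w1, w2, oa, target)
```

In the optical system all of these weight changes would be written in parallel; the loops here only make the arithmetic explicit.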

Although the number of interconnects that can be stored is proportional to the volume
of the hologram, which scales as the linear dimension cubed, the number of possible
interconnections between two NxN neural planes is N^4, which scales as the linear dimension
to the fourth power. This reflects the fact that each grating wavevector can be read out
by a multiplicity of input/output wavevector pairs, which can result in unwanted cross-talk
between neurons (9). This is illustrated in Fig. 8, which shows that each grating wavevector
K_g can be read out by a set of input/output wavevector pairs which forms two cones touching
at their apexes. In other words, all wavevector pairs lying on the surfaces of the two
cones satisfy the Bragg condition for diffraction off the grating represented by K_g, which
can result in unwanted cross-talk between the K_g "weights".

Several approaches can be used to resolve this readout ambiguity, including sampling
of the neuron planes using fractal grids (S). In our preferred approach, the object
wavevectors are free to vary in both θ and φ (two-dimensional pixel arrays), but the
reference wavevectors are confined to a plane using cylindrical lenses (one-dimensional
line pixel arrays). This results in the volume filling of K space with grating wavevectors
and a total number of possible interconnects which scales as the linear dimension cubed.
The degrees of freedom in the volume hologram are then matched to the number of required
interconnects and cross-talk is automatically avoided. This arrangement also maps well to
many neural network models in which a number of neurons in one layer are connected to
smaller or larger numbers of neurons in succeeding layers. In the hybrid NHAM this type of
partitioning also results in larger gain because one-dimensional pixel lines rather than
points are used in the reference plane. The pixel lines intercept a greater fraction of
the readout beam, which results in brighter retroreflected reference beams. If N_1 is the
number of pixels that can be resolved along a line by an LCLV, then the number of
noninterfering interconnections between two planes is N_1^3 using this partitioning method,
which is the same as the fractal partitioning method.
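
The counting argument can be written out as a short worked example. The value N_1 = 100 is an arbitrary illustrative choice and not a figure from the report; the same symbol is used for both cases only to make the comparison direct.

```latex
\begin{align*}
\text{two fully connected } N_1 \times N_1 \text{ planes:}\quad
  & N_1^{2} \cdot N_1^{2} = N_1^{4},\\
\text{2-D object plane } (N_1 \times N_1) \text{ with 1-D reference lines } (N_1):\quad
  & N_1^{2} \cdot N_1 = N_1^{3},\\
\text{for example, } N_1 = 100:\quad
  & 10^{8} \text{ possible links versus } 10^{6} \text{ noninterfering links.}
\end{align*}
```

The second count matches the cubic scaling of the hologram volume, which is why the partitioned arrangement avoids the readout ambiguity.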

A detailed drawing of the proposed optical back-propagation system is shown in Fig. 9.
This system is virtually identical to the opto-electronic resonator associative memory
system described in the previous section except for the substitution of a LiNbO3 volume
hologram for the thin thermoplastic film hologram and the addition of a few lenses and an
SLM. The "top" and "bottom" activation patterns OA and OB are located side by side in the
input plane. An incoherent-to-coherent conversion is performed in the object loop using a
vidicon detector and LCLV.


5.0  Summary 

A hybrid opto-electronic nonlinear holographic associative memory (NHAM) was described
and theoretical and experimental results discussed. The NHAM consists of a hologram in an
opto-electronic cavity. Gain, feedback, and nonlinear processing of the reference beams are
provided by vidicon detectors, an image processor, and liquid crystal light valves.
Numerical simulations demonstrated the beneficial effects of nonlinearities in the
correlation domain on the storage capacity and object discrimination of NHAM. Operation of
the system as a resonator was experimentally demonstrated. The error-correction properties
were evident as the input image could be rotated over a range of 10° with no observable
degradation in the associated output image.

A design for a hybrid opto-electronic resonator neural network architecture based on
volume holograms and capable of learning using error back-propagation was also discussed.
The design is a direct extension of the opto-electronic nonlinear holographic associative
memory. The use of spatially multiplexed subholograms in photorefractive crystals should
allow the implementation of a multi-layer optical neural network consisting of millions of
neurons with potential processing rates of 1x10° interconnects per sec. This optical
neural net can be constructed from standard components and would not require the
development of new devices or the use of excessive optical power levels. The use of video
electronics in the feedback and back-propagation paths simplifies interfacing to an outside
computer host and allows the implementation of general nonlinear activation functions. In
this hybrid system, optics provides the massive connectivity and parallelism necessary for a
neural network, while electronics provides the nonlinear processing. Both are therefore
used in the roles to which they are best suited. Such a system would find numerous
applications in adaptive information processing and control systems.

This work was supported in part by the Air Force Office of Scientific Research. The
authors wish to thank C. DeAnda for expert technical assistance.

Figure C-1. Generic NHAM

Figure C-2. Nonlinear correlation model of NHAM

Figure C-7. Weight adaptation and readout in NHAM-based neural network

Holographic Associative Memories

Abstract

A Lyapunov or "energy" function based on Kosko's BAM model
of associative memory is derived for optical associative memories based
on thin holograms in a nonlinear cavity. The dynamic behavior is
illustrated using computer simulations.

1.  Introduction 

Neural network implementations of associative memory have a wide range of potential
applications including content-addressable memories with error correction, pattern recognition,
and adaptive sensory-motor mappings for robotic control, among others. A large body of
theoretical work on associative neural networks performed over the past twenty years is now
beginning to be exploited for such engineering applications. It is commonly felt that conventional
serial computers are not suitable for future neural network applications involving large numbers
of neurons because of the rapid scaling of connectivity and weight update rates with problem
size. Practical systems employing neural network algorithms will require special purpose
parallel computers onto which neural network models can be directly mapped.

As an alternative to conventional computers, which lack the fine-grained parallelism and
connectivity required by neural network models, much work has recently been done on optical
and hybrid optical/electronic neural networks. The very high storage capacity, connectivity, and
parallelism of optics make such systems attractive for this application. In particular, nonlinear
holographic associative memories (NHAM) have enjoyed a high degree of interest and activity
in recent years. These systems improve the associative properties of ghost image holography
pioneered by Gabor [1], Van Heerden [2], and Collier and Pennington [3] by placing the
hologram in an optical feedback cavity with nonlinear gain. The images stored in the hologram then
become the eigenmodes of the cavity and form stable limit points of the system.

In this paper I will interpret the dynamics of thin hologram NHAMs in terms of a neural
network model, specifically the Bidirectional Associative Memory (BAM) model of Kosko. I
will show that although NHAM systems are direct optical implementations of the BAM model,
the limit points are not in general the intended stored memories unless the memory patterns in
one neural layer assume a particular form. This restriction is removed if volume holograms are
used together with some precautions to avoid crosstalk, although the natural shift invariance of
the thin hologram NHAM is then sacrificed. I will use computer simulations to illustrate NHAM
dynamic behavior.

2.  NHAM  architecture 

The basic principle behind the associative properties of holograms is illustrated in Fig. 1,
which depicts a highly idealized hologram formation process. During recording, two coherent
optical wavefronts, spatially modulated by transmittances a and b, are Fourier transformed and
interfere in a photosensitive medium, forming a fringe pattern |A+B|^2 where A and B are the
Fourier transforms of a and b, respectively. (Nonlinearities in the photoresponse of the medium
are ignored.) If after development the hologram is addressed with object α and the output is
Fourier transformed with a lens, then the output amplitude is given by

$$\text{output} = b * (a \otimes \alpha) \qquad (1)$$

where ⊗ and * signify correlation and convolution, respectively. This result holds for a thin
hologram in which volume diffraction effects can be ignored. (Two other terms are also
produced but they are spatially separated and can be ignored.) This result forms the basis for the
early work in ghost image holography: if a has the proper image statistics and α is sufficiently
similar to a, the correlation a ⊗ α will resemble a delta function and the output will closely
resemble b, forming an association between a and b. Unfortunately, for many images the
autocorrelation is not sufficiently sharp to prevent significant degradation of the output. Moreover,
when attempts are made to store multiple objects, crosstalk noise further degrades the output
quality. Another problem is that such a linear system is incapable of choosing between
competing outputs, so that superimposed distorted inputs result in superimposed distorted outputs. It is
also very difficult to cascade such linear systems because of the buildup of distortions and noise
from stage to stage.
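
The linear recall of Eq. (1), a convolution of b with the correlation of the stored object and the input, can be sketched in one dimension. The Barker-code object, the three-sample b, and the helper names are illustrative assumptions, not data from the paper.

```python
# Hypothetical 1-D sketch of linear holographic recall: output = b * (a (x) alpha),
# where (x) is correlation and * is convolution. All vectors are illustrative.

def correlate(a, x):
    """Full cross-correlation c[s] = sum_j a[j] * x[j + s], stored at index s + len(a) - 1."""
    n_a, n_x = len(a), len(x)
    out = []
    for s in range(-(n_a - 1), n_x):
        out.append(sum(a[j] * x[j + s] for j in range(n_a) if 0 <= j + s < n_x))
    return out

def convolve(b, c):
    """Full linear convolution of two sequences."""
    out = [0] * (len(b) + len(c) - 1)
    for k, b_k in enumerate(b):
        for n, c_n in enumerate(c):
            out[k + n] += b_k * c_n
    return out

a = [1, 1, 1, -1, -1, 1, -1]   # stored object (a length-7 Barker code, sharp autocorrelation)
b = [1, 0, 2]                  # associated pattern
alpha = a                      # address the hologram with the stored object itself

autocorrelation = correlate(a, alpha)
output = convolve(b, autocorrelation)
recalled = output[6:9]         # samples aligned with the correlation peak
```

Here `autocorrelation` has a peak of 7 with sidelobes of 0 or -1, and `recalled` comes out as `[5, 0, 13]` rather than the ideal `[7, 0, 14]`. The shortfall is the sidelobe contribution, a small instance of the degradation described above.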

In order to circumvent the above problems we [4] and others [5] [6] [7] [8] [9] [10] [11]
proposed and demonstrated nonlinear holographic memories (NHAM) in which optical feedback
and nonlinear gain are used to choose between the stored memories. The physical form of the
retroreflection/nonlinearity mechanism used by various workers varies from all-optical to hybrid
optical/electronic mechanisms. A diagram of one type of NHAM architecture is shown in Fig. 2.
Angularly shifted plane wave reference beams are used to record a set of objects a_m. Each b_m is
therefore a shifted delta function. When the hologram is addressed by an input α, the output

$$\beta = \sum_{m=1}^{M} b_m * (a_m \otimes \alpha) \qquad (2)$$


is  passed  through  a  nonlinearity  f()  and  retroreflected  back  to  the  hologram,  forming  an  output 

$$\alpha = \sum_{m=1}^{M} a_m * (b_m \otimes f(\beta)) \qquad (3)$$


which is in turn passed through a separate nonlinearity F() and retroreflected back to the
hologram where the cycle repeats, forming an iterative dynamic system which converges to limit
points which are the eigenmodes of the optical cavity formed by the retroreflectors and the
hologram. It will be shown in Sec. 3 that if the b_m are shifted delta functions (corresponding to
angularly shifted plane wave reference beams) the limit points of the dynamic system correspond
to the stored a-b associated image pairs.

The architecture of Fig. 2 can be described in flow diagram format as a closed loop
consisting of a forward linear transformation, point nonlinearity, backward linear transformation,
and another point nonlinearity (Fig. 3). If we assume the stored memories are one-dimensional
vectors, the forward and backward linear transformations H_ab and H_ba can be described by matrix
representations of the linear operations listed in Eqs. (2) and (3), respectively. (The extension to
two-dimensional images is straightforward and does not add to the present discussion.) The
form of the matrix H_ab can be determined from inspection and is illustrated in Fig. 4 for
angularly multiplexed plane wave reference beams. It is a band M(2N-1) x N_1 matrix where each
row in band m consists of a shifted version of a_m and m is an index for the stored associative
memory vectors. N is the size of the memory vectors, N_1 is the length of the "window" in which
the input vector is embedded (this allows for translational invariance), and M is the total number
of stored vectors. The elements of the forward transformation matrix H_ab are given by


[Eq. (4): a sum over m = 1, ..., M; the summand is not legible in the source.]

The backward transformation matrix H_ba is similar in form. If the input vector is padded with
zeros to increase its length to N_1 = M(2N-1) then both matrices are square and H_ba is equal to the
matrix transpose of H_ab:

$$H_{ba} = H_{ab}^{T} \qquad (5)$$

The  above  formulation  of  NHAM  dynamics  is  formally  equivalent  to  the  bidirectional 
associative  memory  (BAM)  model  described  by  Kosko.  This  interpretation  of  NHAM  dynamics 
is  discussed  further  in  the  next  section. 
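
The band structure of the forward matrix can be made concrete with a small sketch. It assumes the window length equals the memory length N (the paper allows a longer window), and the stored vectors and helper names are illustrative. Each row in band m holds a shifted copy of the m-th stored vector, so the matrix-vector product stacks the correlations of the input with every stored vector.

```python
# Hypothetical sketch of the band matrix for the forward pass. Assumption: the
# input window length equals the memory length N, so the matrix is M(2N-1) x N.

def band_matrix(stored, n):
    """Stack, for each stored vector, the 2N-1 shifted copies used by correlation."""
    rows = []
    for a_m in stored:
        for s in range(-(n - 1), n):
            rows.append([a_m[j - s] if 0 <= j - s < n else 0 for j in range(n)])
    return rows

def matvec(h, x):
    return [sum(h_ij * x_j for h_ij, x_j in zip(row, x)) for row in h]

def transpose(h):
    return [list(column) for column in zip(*h)]

stored = [[1, -1, 1], [1, 1, -1]]   # M = 2 illustrative memories of length N = 3
alpha = [1, -1, 1]                  # input equal to the first memory

h_forward = band_matrix(stored, 3)  # forward transformation, 10 x 3
h_backward = transpose(h_forward)   # backward transformation as the transpose, 3 x 10
beta = matvec(h_forward, alpha)
```

The first five entries of `beta` are the correlation of the input with the first memory (peak value 3 at the band centre) and the last five are the cross-correlation with the second memory.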

3.  BAM  interpretation  of  NHAM  dynamics 

Kosko's BAM model [12] is illustrated in Fig. 5. Two fields of neurons F_A and F_B
characterized by sigmoidal or hard thresholding activation functions are connected by a set of weights
h_ij. Patterns activating field F_A are thresholded, weighted, and transmitted to F_B (bottom-up).
Those patterns are then in turn thresholded by F_B and transmitted back down to F_A via the same
set of weights (top-down). This sequence then repeats in ping-pong fashion. Kosko has shown
that the function


$$E(\alpha, \beta) = -(\beta^{T} H - \theta_{\alpha}^{T})\,\alpha - (\alpha^{T} H^{T} - \theta_{\beta}^{T})\,\beta \qquad (6)$$


always decreases as the system evolves. In the above expression (α, β) are column vectors
representing the patterns in (F_A, F_B) and (θ_α^T, θ_β^T) are threshold levels. Since E is bounded from
below it is an "energy" or Lyapunov function and NHAM can be modeled as a nonlinear
dissipative system. The minima of E correspond to stable limit points. The only necessary condition on
the connection matrix is that H (top-down) is the transpose of H (bottom-up). This corresponds
to bidirectional weights, i.e. the same weight connects neurons i and j in both directions. Kosko
showed that the limit points correspond to stored associative memory pairs a_m and b_m if the
connection weights are given by a sum of outer products:


$$h_{ij} = \sum_{m=1}^{M} (b_m)_i \,(a_m)_j \qquad (7)$$
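
A minimal bipolar BAM sketch illustrates the outer-product weights and the energy function. It assumes zero thresholds and hard limiting, in which case Eq. (6) reduces to minus twice the bilinear form β^T H α. The stored pairs and function names are illustrative choices, not taken from the paper.

```python
# Hypothetical bipolar BAM sketch with outer-product weights, zero thresholds,
# and hard limiting. The stored pairs are illustrative.

def outer_product_weights(a_list, b_list):
    """h[i][j] = sum over stored pairs of b_m[i] * a_m[j]."""
    rows, cols = len(b_list[0]), len(a_list[0])
    return [[sum(b[i] * a[j] for a, b in zip(a_list, b_list)) for j in range(cols)]
            for i in range(rows)]

def hard_limit(values, previous):
    """Threshold at zero; a neuron with zero input keeps its previous state."""
    return [1 if v > 0 else -1 if v < 0 else p for v, p in zip(values, previous)]

def energy(h, alpha, beta):
    """Eq. (6) with zero thresholds: E = -2 * beta^T H alpha."""
    return -2 * sum(beta[i] * h[i][j] * alpha[j]
                    for i in range(len(beta)) for j in range(len(alpha)))

def recall(h, alpha, steps=5):
    beta = [1] * len(h)
    energies = []
    for _ in range(steps):
        # Bottom-up pass, then top-down pass through the transposed weights.
        beta = hard_limit([sum(h[i][j] * alpha[j] for j in range(len(alpha)))
                           for i in range(len(h))], beta)
        alpha = hard_limit([sum(h[i][j] * beta[i] for i in range(len(h)))
                            for j in range(len(alpha))], alpha)
        energies.append(energy(h, alpha, beta))
    return alpha, beta, energies

a1, a2 = [1, 1, 1, -1, -1, -1], [1, -1, 1, -1, 1, -1]
b1, b2 = [1, 1, -1, -1], [1, -1, 1, -1]
h = outer_product_weights([a1, a2], [b1, b2])

noisy_a1 = [-1, 1, 1, -1, -1, -1]          # a1 with its first element flipped
alpha_out, beta_out, energies = recall(h, noisy_a1)
```

With these values the network settles on the stored pair (a1, b1) after the first ping-pong cycle, and the recorded energy does not increase afterwards.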


The BAM formalism can be applied directly to analyzing NHAM dynamics since BAM
dynamics, as evident from Fig. 3, is identical in form to that of the NHAM framework. The
NHAM Lyapunov function is therefore given by the above expression for E if H is the matrix
illustrated in Fig. 4 which describes the linear transformation performed by the hologram. In
general, the NHAM limit points are not (a_m, b_m) because H is given by a combination of
correlation/convolution operations as opposed to the simple sum of outer products of the BAM. The
convolutions superimpose multiple blurred output terms which cannot be deblurred using simple
point nonlinearities and feedback. However, for the special case of angularly multiplexed plane
wave reference beams, the b vectors are spatially shifted delta functions which separate the
various correlation terms, allowing point nonlinearities to favor the strongest correlation peak.
This can be more easily seen if E is rewritten in the correlation/convolution format:


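$$E(\alpha, \beta) = -\Big[\sum_{m=1}^{M} a_m * (b_m \otimes \beta) - \theta_{\alpha}\Big]^{T} \alpha - \Big[\sum_{m=1}^{M} b_m * (a_m \otimes \alpha) - \theta_{\beta}\Big]^{T} \beta \qquad (8)$$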


The above expression is locally minimized if (α, β) equal one of the stored associations and the
b_m are delta functions,

$$b_m = \delta(x - x_m), \qquad \alpha = a_{m_0}, \qquad \beta = b_{m_0} \qquad (9)$$

since the cross-correlation noise is then separated out in both the α and β domains, converting
the above expression for E into a sum of two inner products which is minimized for
(α, β) = (a_{m_0}, b_{m_0}). The iterative equation for the evolution of the α pattern can then be written as
[13]:


$$\alpha_k = F\Big(\sum_{m=1}^{M} a_m * f(a_m \otimes \alpha_{k-1})\Big) \qquad (10)$$


where it has been assumed that an aperture in the α domain eliminates extraneous holographic
reconstructions. If the nonlinearity f() in the β domain is faster than linear or sigmoidal, the
correlation peak sidelobes will tend to be reduced on each iteration and α will converge to one of
the stored memories, assuming that the initial input was sufficiently similar to that stored
memory and the hologram is not overloaded.
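
The iteration of Eq. (10) can be sketched for short bipolar vectors. Several choices here are assumptions for illustration rather than the paper's settings: f is a half-wave cubic (one example of a faster-than-linear nonlinearity), F is a hard limiter, the aperture keeps the N samples aligned with the correlation peak, and each correlation band is handled as a separate list because the shifted delta-function references keep the bands apart.

```python
# Hypothetical 1-D sketch of the NHAM iteration: correlate the input with each
# stored vector, sharpen with f, convolve back with the stored vector, sum,
# aperture, and hard-limit with F. Vectors and nonlinearities are illustrative.

def correlate(a, x):
    """Full cross-correlation c[s] = sum_j a[j] * x[j + s], stored at index s + len(a) - 1."""
    n_a, n_x = len(a), len(x)
    return [sum(a[j] * x[j + s] for j in range(n_a) if 0 <= j + s < n_x)
            for s in range(-(n_a - 1), n_x)]

def convolve(b, c):
    """Full linear convolution of two sequences."""
    out = [0] * (len(b) + len(c) - 1)
    for k, b_k in enumerate(b):
        for n, c_n in enumerate(c):
            out[k + n] += b_k * c_n
    return out

def f(value):
    """Correlation-domain nonlinearity (assumed): half-wave cubic."""
    return value ** 3 if value > 0 else 0

def big_f(value):
    """Object-domain nonlinearity (assumed): hard limiter."""
    return 1 if value > 0 else -1

def nham_step(stored, alpha):
    n = len(alpha)
    total = [0] * (3 * n - 2)
    for a_m in stored:
        sharpened = [f(v) for v in correlate(a_m, alpha)]
        for index, value in enumerate(convolve(a_m, sharpened)):
            total[index] += value
    # Aperture: keep only the N samples aligned with the correlation peak.
    return [big_f(v) for v in total[n - 1:2 * n - 1]]

a1 = [1, 1, 1, -1, -1, 1, -1]
a2 = [1, -1, 1, 1, -1, -1, -1]
stored = [a1, a2]

recalled = [1, 1, 1, -1, -1, 1, 1]   # a1 with its last element reversed in sign
for _ in range(3):
    recalled = nham_step(stored, recalled)
```

For this input the single error is corrected on the first pass and the stored vector a1 is then reproduced unchanged on later passes.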

The results of a computer simulation of this process are shown in Fig. 6, in which a
sigmoidal nonlinearity was used. Figure 6a is a plot of the β domain for five iterations through the
NHAM. The corresponding input or α domain is shown in Fig. 6b. The random stored vectors
were 25 pixels long and the input pattern was one of the four stored vectors with the first three
pixels reversed in sign. Note the error correction as the system evolves. The norm of the vector
difference between α and the stored vector is plotted in Fig. 7a, illustrating the convergence of
the NHAM to one of the stored states. The monotonically decreasing Lyapunov function is
plotted in Fig. 7b. The radius of attraction of the stored states can be estimated from Fig. 8,
which shows the probability of convergence as a function of the Hamming distance of the input
vector from the nearest memory vector. The four curves are parameterized by the number of
stored vectors. The effects of too many errors in the input vector on NHAM performance are
evident in Fig. 9. Here the number of pixels per vector was reduced to eight, three of which were
in error in the input vector. In this case the NHAM converged to the "wrong" stored memory, as
is evident from Fig. 9a. The Lyapunov function, of course, still decreased.

4.  Summary 

In this paper I have discussed a Lyapunov function for thin hologram NHAMs. Such an
NHAM can be considered to be a BAM where patterns in one field (corresponding to the NHAM
β domain) are coded to prevent overlapping of the correlation terms. The use of angularly
multiplexed plane wave reference beams implements this necessary coding. Nonlinearities can
then be used to sharpen the correlation peaks and undo the effects of the blurring convolution
operations, allowing the NHAM to converge to one of the stored states. A single-pass
feed-forward NHAM can also be made equivalent to a Hamming neural network [14] by implementing
a winner-take-all competitive network in the β domain using excitatory and inhibitory local
interconnections to select the strongest correlation peak. Generalized BAMs can be
implemented in the NHAM framework if thick holograms are used, as pointed out by Guest and
TeKolste [15], although the natural shift-invariance of the thin hologram NHAM is then lost and special
techniques must be used to avoid crosstalk [16] [17]. The large storage capacity of a volume
hologram is, however, very attractive for neural network applications.

Figure D-1. Schematic of ghost imaging linear holographic associative memory.

Figure D-2. NHAM architecture. (Diagram label: stored b_m.)

Figure D-3. NHAM flow diagram. (H_ab and H_ba are linear transforms.)

Figure D-4. Matrix description of forward pass through thin hologram.

Figure D-6. Results of NHAM computer simulation (plot title: "Dynamic Evolution of Object
Reconstruction"; axes: light amplitude versus position in beta plane).
a. Evolution of β domain over five iterations showing convergence to a stored state.
b. Corresponding evolution of α domain.

Figure D-7. (Plot title: "Convergence of Output Pattern"; panels: distance from memory vector
and NHAM Lyapunov function.) a. Vector distance of α from nearest stored memory over five
iterations showing convergence to that memory. b. Corresponding
decrease of Lyapunov function.

Figure D-8. Probability of NHAM convergence as function of initial errors (plot title: "NHAM
Convergence Parameterized by M"; horizontal axis: Hamming distance of input vector from
nearest memory vector). The four curves correspond to number of stored memories equal to
2, 4, 6, and 8. The length of the memories was 25.

Figure D-9. (Plot title: "Convergence of Output Pattern".) a. Vector distance of α from nearest
stored memory over five iterations for case in which the number of initial errors was too
great for convergence to the nearest memory vector. b. The Lyapunov
function strictly decreased for this case, as it should for all cases.

Optical neural network architectures are described which store each
connection weight in a continuum of spatially distributed photorefractive gratings.
This approach reduces cross-talk and fully utilizes the spatial light modulator.

Self-Pumped Optical Neural Networks
Yuri  Owechko 

Hughes  Research  Laboratories 
Malibu,  California  90265 

Neural network models for artificial intelligence offer an approach
fundamentally different from conventional symbolic approaches, but the merits of
the two paradigms cannot be fairly compared until neural network models with
large numbers of "neurons" are implemented. Despite the attractiveness of neural
networks for computing applications which involve adaptation and learning, most
of the published demonstrations of neural network technology have involved
relatively small numbers of "neurons". One reason for this is the poor match
between conventional electronic serial or coarse-grained multiple-processor
computers and the massive parallelism and communication requirements of neural
network models. The self-pumped optical neural network (SPONN) described here
is a fine-grained optical architecture which features massive parallelism and a
much greater degree of interconnectivity than bus-oriented or hypercube electronic
architectures. SPONN is potentially capable of implementing neural networks
consisting of 10^5-10^6 neurons with 10^9-10^10 interconnections. The mapping of
neural network models onto the architecture occurs naturally without the need for
multiplexing neurons or dealing with contention, routing, and communication
bottleneck problems. This simplifies the programming involved compared to
electronic implementations.

Previous optical holographic implementations of neural network models used
a single grating in a photorefractive crystal to store a connection weight between
two neurons (each pixel in the input/output planes corresponds to a single
neuron). This approach relies on the Bragg condition for angularly selective
diffraction from a grating to avoid cross-talk between neurons. However, because
of the angular degeneracy of the Bragg condition, the neurons must be arranged
in special patterns in the input/output planes to fully eliminate cross-talk. This
results in sub-sampling of the spatial light modulators (SLMs) and incomplete
utilization of the SLMs if the single grating per weight approach is used.
Specifically, assuming the SLMs are capable of displaying NxN pixels, the single
grating per weight method can store only N^(3/2) neurons and N^3 interconnections.

I describe here an approach in which the Bragg degeneracy is broken by
distributing each interconnection weight among a continuum of angularly and
spatially distributed gratings. This eliminates cross-talk between neurons, making
sub-sampling of the input/output planes unnecessary and allowing full utilization
of the SLM space-bandwidth product. In other words, N^2 neurons can be fully
interconnected provided the interconnection medium has sufficient degrees of
freedom or space-bandwidth product to store the N^4 interconnection weights. By
forcing signal beams to match the Bragg condition at many spatially distributed
gratings, the signal-to-noise ratio should also be improved.
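
The two capacity figures can be compared numerically. The sketch below simply evaluates the scaling laws quoted in the text for an SLM with N x N pixels; the value N = 100 and the function names are illustrative assumptions.

```python
# Hypothetical comparison of the two weight-storage schemes for an N x N pixel SLM.
# N = 100 is an arbitrary illustrative value.

def single_grating_capacity(n):
    """Single grating per weight: N^(3/2) neurons and N^3 interconnections."""
    return round(n ** 1.5), n ** 3

def distributed_grating_capacity(n):
    """Continuum of gratings per weight: N^2 neurons and N^4 interconnection weights."""
    return n ** 2, n ** 4

n = 100
single = single_grating_capacity(n)
distributed = distributed_grating_capacity(n)
neuron_gain = distributed[0] // single[0]
```

For N = 100 the distributed scheme uses all 10,000 pixels as neurons, ten times as many as the sub-sampled arrangement, provided the crystal can hold the corresponding number of weights.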

The continuum of gratings is generated by using a self-pumped phase
conjugate mirror (SP-PCM) in conjunction with a SLM, CCD detector, frame
grabber, and host computer. Several theories have been published for self-pumped
phase conjugation in BaTiO3 crystals, including internal resonators based on
four-wave mixing aided by Fresnel reflections and stimulated photorefractive
backscattering. A common feature of these theories is that each pixel in the
input plane writes gratings with and pumps all other pixels to form the phase
conjugate wavefront. This results in a physical system which is massively
interconnected and parallel, and which is a natural medium for implementation of
neural network models. The distributed gratings in the crystal serve as the
interconnection mechanism while the frame grabber in conjunction with the host
computer implements programmable neuron activation functions. By spatially
segregating the input/output plane, multiple layer neural network models can be
implemented. This hybrid system combines the parallelism and interconnectivity
of optics with the programmability of electronics.

A  diagram  of  an  experimental  system  used  to  demonstrate  these  concepts  is 
shown  in  Fig.  1.  The  "object  plane”  corresponds  to  the  plane  of  neurons 
represented  by  pixels  on  an  LCLV  (liquid  crystal  light  valve).  Activation  patterns 
displayed  on  the  LCLV  are  impressed  on  a  light  beam  which  is  focused  into  the 
SP-PCM.  Connections  between  the  pixels  are  formed  and  the  phase  conjugate 
return  is  detected  by  a  video  camera.  The  return  is  processed  on  a  point  by 
point  basis  by  the  frame  grabber/image  processor  before  being  displayed  again  on 
the LCLV. In neural network models such as backpropagation, an error signal
would be formed electronically and displayed on the LCLV to adjust the weights
between neurons. The error signals are formed on a point-by-point basis (local
operations) and so are not computationally intensive.
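
The loop described above can be sketched in software. This is only an illustrative model, not the apparatus itself. It assumes that the optical interconnection acts on the displayed activations as a linear weighted sum, and the names `optical_return`, `activation`, `error_signal`, and `iterate`, the logistic activation, and the example weights are all hypothetical.

```python
import math


def optical_return(weights, activations):
    """Assumed model of the SP-PCM: each detected pixel is a weighted
    sum of all displayed pixels (global, parallel interconnection)."""
    return [sum(w * a for w, a in zip(row, activations)) for row in weights]


def activation(x, gain=4.0, threshold=0.5):
    """Point-by-point operation performed electronically by the frame
    grabber and host computer (hypothetical logistic function)."""
    return 1.0 / (1.0 + math.exp(-gain * (x - threshold)))


def error_signal(target, output):
    """Error formed pixel by pixel, using local operations only."""
    return [t - o for t, o in zip(target, output)]


def iterate(weights, activations, steps=1):
    """Run the display -> optics -> camera -> electronics loop."""
    for _ in range(steps):
        detected = optical_return(weights, activations)   # optics
        activations = [activation(x) for x in detected]   # electronics
    return activations


# Hypothetical 3-neuron example with symmetric weights.
weights = [[0.0, 0.6, 0.4],
           [0.6, 0.0, 0.4],
           [0.4, 0.4, 0.0]]
state = iterate(weights, [1.0, 0.0, 0.0], steps=3)
```

In this sketch the only global operation is `optical_return`, which stands in for the crystal. Everything the electronics does is local to a pixel, which is why the text describes the electronic workload as light.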

An  experimental  demonstration  of  optical  connectivity  using  the  apparatus  of 
Fig.  1  is  shown  in  Fig.  2.  Fig.  2a  shows  the  phase  conjugate  return  for  an 
input  consisting  of  a  complete  resolution  pattern.  The  input  was  then  switched 
to  the  region  enclosed  by  the  dashed  ellipse  in  Fig.  2b.  The  return  consisted  of 
the  complete  resolution  pattern,  as  shown  in  Fig.  2b,  verifying  that  connection 
weights  were  formed  globally  among  all  the  pixels.  Cross-talk  suppression  is 
illustrated  in  Fig.  3.  The  input  to  the  SP-PCM  consisted  of  an  array  of  dots  on 
a  rectangular  grid  (Fig.  3a).  The  conjugate  return  is  shown  in  Fig.  3b.  When 
the  input  was  shifted  even  a  slight  amount,  the  return  disappeared  (Fig.  3c) 
which  verified  that  pixels  do  not  have  to  be  arranged  in  special  patterns  on  the 
SLM  to  avoid  cross-talk.  Finally,  in  Fig.  4  selective  erasure  of  weights  is 
demonstrated.  The  central  neuron  was  deactivated  in  Fig.  4b  by  shifting  the 
phase  of  that  neuron  on  the  LCLV.  This  shifts  the  phase  of  the  gratings  written 
by that neuron and selectively erases connections between it and the other
neurons,  demonstrating  that  learning  using  bipolar  error  signals  is  possible. 

This  work  was  supported  in  part  by  the  Air  Force  Office  of  Scientific 
Research. 

1. D. Psaltis, J. Yu, X. G. Gu, and H. Lee, "Optical Neural Nets Implemented
with Volume Holograms," OSA Topical Meeting on Optical Computing, Incline
Village, Nevada, 1987, Paper TuA3-1.

Figure E-1. Schematic of self-pumped optical neural network apparatus.

Figure E-2. Demonstration of connectivity of self-pumped PCM.

Figure E-3. Demonstration of cross-talk suppression in self-pumped optical neural network.

Figure E-4. Demonstration of selective weight erasure.

Photorefractive Optical Neural Networks

Neural network models for pattern recognition, clustering, and optimization
offer an alternative to conventional statistical methods. In the absence of a
unifying theory, however, the two paradigms cannot be fairly compared until
neural network models with large numbers of "neurons" are implemented in
dedicated hardware. Despite the attractiveness of neural networks
for  computing  applications  which  involve  adaptation  and  learning,  most  of  the 
published  demonstrations  of  neural  network  technology  have  involved  relatively 
small numbers of "neurons". One reason for this is the poor match between
conventional  electronic  serial  or  coarse-grained  multiple-processor  computers  and 
the  massive  parallelism  and  fine-grain  communication  requirements  of  neural 
network  models.  Approaches  currently  being  pursued  for  dedicated  hardware 
implementations  include  special  purpose  digital  and  analog  integrated  circuits  as 
well  as  hybrid  optical/electronic  architectures. 

In  my  talk  I  will  discuss  holographic  neural  network  architectures  in  which 
the  connection  weights  between  neurons  are  implemented  as  gratings  in  a 
photorefractive crystal. In particular, I will discuss the self-pumped optical
neural network (SPONN), a fine-grained optical architecture that features
massive parallelism and a much greater degree of interconnectivity than
bus-oriented or hypercube electronic architectures. Connections between neurons
are implemented as sets of angularly and spatially multiplexed volume phase
gratings. SPONN is potentially capable of implementing neural networks
consisting of $10^5$ to $10^6$ neurons with $10^9$ to $10^{10}$
interconnections. The mapping of neural network
models  onto  the  architecture  occurs  naturally  without  the  need  for  multiplexing 
neurons  or  dealing  with  the  contention,  routing,  and  communication  bottleneck 
problems  of  electronic  parallel  computers.  This  simplifies  the  programming  of  the 
optical  system. 

An  alternative  approach  to  optical  holographic  implementations  of  neural 
network  models  utilizes  a  single  grating  in  a  photorefractive  crystal  to  store  each 
connection weight between two neurons (each pixel in the input/output planes
corresponds to a single neuron) [1]. This approach relies on the Bragg condition for
angularly  selective  diffraction  from  a  single  grating  to  avoid  cross-talk  between 
neurons.  However,  because  of  the  angular  degeneracy  of  the  Bragg  condition,  the 
neurons must be arranged in special patterns in the input/output planes to fully
eliminate cross-talk. This results in sub-sampling and incomplete utilization of
the spatial light modulators (SLMs) if the single grating per weight approach is
used. Specifically, assuming the SLMs are capable of displaying $N^2$ pixels,
the single grating per weight method can store only $N^{3/2}$ neurons and $N^3$
interconnections. In my talk I will describe the SPONN approach in which the
Bragg degeneracy is broken by distributing each interconnection weight among a
continuum of angularly and spatially distributed gratings. This eliminates
cross-talk between neurons, making sub-sampling of the input/output planes
unnecessary and allowing full utilization of the SLM space-bandwidth product. In
other words, $N^2$ neurons can be fully interconnected provided the
interconnection medium has sufficient degrees of freedom or space-bandwidth
product to store the $N^4$ interconnection weights. By forcing signal beams to
match the Bragg
condition  at  many  spatially  distributed  gratings,  the  signal-to-noise  ratio  should 
also  be  improved. 
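
The scaling argument can be made concrete with a short calculation. The sketch below takes the scaling as N^2 pixels per SLM, N^(3/2) neurons and N^3 interconnections for the single grating per weight method, and N^2 neurons and N^4 weights for the distributed-grating approach. The function names and the example value N = 1024 are hypothetical, and the second function only counts the weights that would have to be stored. It says nothing about whether a given crystal can hold them.

```python
import math


def single_grating_capacity(n):
    """Neurons and interconnections for an SLM of n x n pixels when each
    weight is stored in a single grating: N^(3/2) neurons, N^3 weights."""
    root = math.isqrt(n)
    if root * root != n:
        raise ValueError("n must be a perfect square for an exact N^(3/2)")
    neurons = root ** 3            # N^(3/2)
    return neurons, neurons ** 2   # fully connected: (N^(3/2))^2 = N^3


def distributed_grating_capacity(n):
    """Neurons and weights when every pixel of the n x n SLM is a neuron:
    N^2 neurons, N^4 weights to be stored by the medium."""
    neurons = n ** 2
    return neurons, neurons ** 2


n = 1024  # hypothetical SLM with 1024 x 1024 pixels
print(single_grating_capacity(n))       # (32768, 1073741824)
print(distributed_grating_capacity(n))  # (1048576, 1099511627776)
```

For this example the sub-sampled arrangement uses only 32768 of the 1048576 available pixels, about 3 percent of the SLM.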

The  continuum  of  gratings  is  generated  by  using  a  self-pumped  phase 
conjugate  mirror  (SP-PCM)  in  conjunction  with  a  SLM,  CCD  detector,  frame 
grabber,  and  host  computer.  Several  theories  have  been  published  for  self-pumped 
phase conjugation in BaTiO3 crystals, including internal resonators based on
four-wave mixing aided by Fresnel reflections and stimulated photorefractive
backscattering.  A  common  feature  of  these  theories  is  that  each  pixel  in  the 
input  plane  writes  gratings  with  and  pumps  all  other  pixels  to  form  the  phase 
conjugate  wavefront.  This  results  in  a  physical  system  which  is  massively 
interconnected  and  parallel,  and  which  is  a  natural  medium  for  implementation  of 
neural  network  models.  The  distributed  gratings  in  the  crystal  serve  as  the 
interconnection  mechanism  while  the  frame  grabber  in  conjunction  with  the  host 
computer  implements  programmable  neuron  activation  functions.  By  spatially 
segregating  the  input/output  plane,  multiple  layer  neural  network  models  can  be 
implemented.  This  hybrid  system  combines  the  parallelism  and  interconnectivity 
of  optics  with  the  programmability  of  electronics. 

A  diagram  of  an  experimental  system  used  to  demonstrate  these  concepts  is 
shown  in  Fig.  1.  The  "neuron  plane"  is  an  optical  representation  of  the  neuron 
activity levels on a spatial light modulator, in our case an LCLV (liquid crystal
light  valve).  Activation  patterns  displayed  on  the  LCLV  are  impressed  on  a  light 
beam  which  is  focused  into  the  SP-PCM.  Connections  between  the  pixels  are 
formed and the phase conjugate return is detected by a video camera. The
return  is  processed  on  a  point  by  point  basis  by  the  frame  grabber/image 
processor  before  being  displayed  again  on  the  LCLV.  In  neural  network  models 
such  as  backpropagation  an  error  signal  would  be  formed  electronically  and 
displayed  on  the  LCLV  to  adjust  the  weights  between  neurons.  The  error  signals 
are formed on a point-by-point basis (local operations) and so are not
computationally intensive.

An  experimental  demonstration  of  optical  connectivity  using  the  apparatus  of 
Fig.  1  is  shown  in  Fig.  2.  Fig.  2a  shows  the  phase  conjugate  return  for  an 
input  consisting  of  a  complete  resolution  pattern.  The  input  was  then  switched 
to  the  region  enclosed  by  the  dashed  ellipse  in  Fig.  2b.  The  return  consisted  of 
the complete resolution pattern, as shown in Fig. 2b, verifying that connection
weights  were  formed  globally  among  all  the  pixels.  Cross-talk  suppression  is 
experimentally  demonstrated  in  Fig.  3.  The  input  to  the  SP-PCM  consisted  of  an 
array  of  dots  on  a  rectangular  grid  (Fig.  3a).  The  conjugate  return  is  shown  in 
Fig.  3b.  When  the  input  was  shifted  by  half  of  the  grid  period,  the  return 
disappeared  (Fig.  3c)  which  verified  that  pixels  do  not  have  to  be  arranged  in 
special patterns on the SLM to avoid cross-talk in the SPONN approach.

Finally,  in  Fig.  4  selective  coherent  erasure  of  weights  is  demonstrated.  The 
central  neuron  was  deactivated  in  Fig.  4b  by  shifting  the  phase  of  that  neuron  on 
the  LCLV.  This  shifts  the  phase  of  the  gratings  written  by  that  neuron  and 
selectively  erases  connections  between  it  and  the  other  neurons,  demonstrating  that 
learning  using  bipolar  error  signals  is  possible. 
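
The idea behind coherent erasure can be illustrated with a toy model. The sketch below is an idealization and not a description of the crystal physics: it assumes that a grating is recorded linearly as a complex amplitude proportional to the product of one neuron's field and the conjugate of the other's, and it ignores saturation and decay. The function `expose` and the recording efficiency `eta` are hypothetical.

```python
import cmath


def expose(grating, field_i, field_j, eta=0.5):
    """Add one exposure to a complex grating amplitude.
    Idealized linear recording: increment = eta * E_i * conj(E_j)."""
    return grating + eta * field_i * field_j.conjugate()


# First exposure: both neurons in phase, so a connection is written.
grating = 0j
grating = expose(grating, 1 + 0j, 1 + 0j)
written = abs(grating)

# Second exposure: neuron i is shifted in phase by pi on the modulator.
# The new increment has the opposite sign and cancels the stored grating.
shifted = cmath.rect(1.0, cmath.pi)
grating = expose(grating, shifted, 1 + 0j)
erased = abs(grating)
```

Because the increment can take either sign depending on the phase, the same mechanism can raise or lower a weight, which is what a bipolar error signal requires.
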

Issues which need to be addressed in the SPONN approach include the
partial erasure of old recordings by new ones and the volatility of the gratings
(gratings are partially erased by the readout process). Partial erasure can be
compensated by using an exposure schedule in which early recordings are made
with larger exposures than later ones.* Grating volatility may possibly be
eliminated by "fixing" the gratings using switching of ferroelectric domains in such
a  way  as  to  transfer  the  charge  pattern  in  the  crystal  to  the  domain  pattern, 
which  is  permanent  at  room  temperature. 
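
One way such an exposure schedule could be computed is sketched below. It is not the schedule used by the authors. It assumes a simple model in which a grating grows toward saturation as 1 - exp(-t/tau) while it is being written and decays as exp(-t/tau) during every later exposure, with the same time constant `tau` for both. The function names and the example values are hypothetical.

```python
import math


def exposure_schedule(count, target, tau=1.0):
    """Exposure times that leave all `count` recordings with the same final
    strength `target` (as a fraction of saturation, below 1/count)."""
    if not 0.0 < target < 1.0 / count:
        raise ValueError("target must lie strictly between 0 and 1/count")
    times = []
    for m in range(1, count + 1):
        # Fraction of recording m that survives all later exposures.
        survival = 1.0 - (count - m) * target
        times.append(-tau * math.log(1.0 - target / survival))
    return times


def final_strengths(times, tau=1.0):
    """Strength of each recording after the whole sequence has been written."""
    strengths = []
    for m, t in enumerate(times):
        later = sum(times[m + 1:])
        strengths.append((1.0 - math.exp(-t / tau)) * math.exp(-later / tau))
    return strengths


times = exposure_schedule(count=4, target=0.2)
strengths = final_strengths(times)
```

Under these assumptions the computed times decrease from the first recording to the last, and every recording ends with the same strength, which matches the qualitative statement that early recordings need larger exposures.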

This  work  was  supported  in  part  by  the  Air  Force  Office  of  Scientific 
Research. 

Figure F-1. Schematic of self-pumped optical neural network apparatus.

Figure F-2. Demonstration of connectivity of self-pumped PCM.

Figure F-3. Demonstration of cross-talk suppression in self-pumped optical neural network.

Figure F-4. Demonstration of selective weight erasure.


